douglib.core¶
Hello!
Created on Mon Aug 26 11:02:21 2013.
A library holding common subroutines and classes that I’ve created.
-
douglib.core._integrate(f, a, b, N=200)¶ Integrate function f from a to b using N iterations.
Parameters: - f (function) – The function to integrate. Must take a single numeric argument (or more if args 2 through n are optional).
- a (float) – The lower limit of the integral.
- b (float) – The upper limit of the integral.
- N (int, optional) – The number of samples to use. Higher numbers yield more accurate values but cost more processing power and memory.
Returns: area – The area under the function.
Return type: float
See also
https://helloacm.com/how-to-compute-numerical-integration-in-numpy-python/
Examples
>>> _integrate(np.sin, 0, np.pi/2, 100)
1.0000102809119051
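The implementation of _integrate is not shown here. The example value above is consistent with a midpoint rule, so the following is a minimal sketch under that assumption; midpoint_integrate is a hypothetical name, not part of douglib.

```python
import math


def midpoint_integrate(f, a, b, n=200):
    """Approximate the integral of f over [a, b] with n midpoint samples."""
    width = (b - a) / n  # width of each sub-interval
    # Sample f at the centre of every sub-interval and sum the rectangle areas.
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


area = midpoint_integrate(math.sin, 0, math.pi / 2, 100)
print(area)  # slightly above the exact value of 1
```

Doubling n roughly quarters the error of a midpoint rule, which is the accuracy versus cost trade-off described for N.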
-
douglib.core.array_2d_to_str(array_2d, delim='')¶ Convert a 2D array to a spreadsheet string.
Parameters: - array_2d (list of lists) – The array to convert.
- delim (str, optional) – The delimiter. Defaults to the empty string. Use ‘,’ to make a true CSV string.
Returns: A csv-compatible string.
Return type: str
-
douglib.core.binary_file_compare(file1, file2)¶ Compare two files byte-by-byte.
Parameters: - file1 (str) – The path to the master file
- file2 (str) – The path to the 2nd file.
Returns: failcode – A flag providing information on where the difference is located.
Return type: int
Notes
Fail codes can be:
- 0: files match
- 1: different sizes
- 2: different first or last byte
- 3: different data in statistically significant random sample
- 4: different data in full search
See significant_subsample() for more information on failcode 3.
-
douglib.core.clip(x, min_max, clipval=None)¶ Clip the value x to x_min or x_max.
If clipval is defined, then returns those values instead. clipval must be a list or tuple of length 2.
Parameters: - x (numeric) – The value to clip
- min_max (sequence of numerics, length 2) – The (minimum, maximum) value to return.
- clipval (sequence of length 2, any type, optional) – The items to return when x is outside of (x_min, x_max). This sequence can be made up of any type.
Returns: clipped
Return type: any
Examples
>>> clip(10, (0, 1))
1
>>> clip(10, (0, 1), clipval=("Zero", "One"))
'One'
>>> clip(5.23, (3.24, 8.91))
5.23
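The clipping rule can be sketched as below. This is an illustration only, written to reproduce the three examples above; clip_sketch is a hypothetical name and the real function may handle edge cases differently.

```python
def clip_sketch(x, min_max, clipval=None):
    """Return x, or a substitute value when x lies outside min_max."""
    low, high = min_max
    # Without clipval, the limits themselves are returned.
    low_out, high_out = clipval if clipval is not None else (low, high)
    if x < low:
        return low_out
    if x > high:
        return high_out
    return x


print(clip_sketch(10, (0, 1), clipval=("Zero", "One")))
```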
-
douglib.core.convert_rcd_xyd(rcd)¶ Convert a list of (a, b, data) to (b, a, data).
Simply swaps the first two items in each sublist. Also sorts the new list by x then y.
Parameters: rcd (list of tuples) – The data to convert.
Returns: A copy of rcd with sublist index 0 and 1 swapped, sorted.
Return type: list of tuples
-
douglib.core.frange(start, stop, step)¶ Generator that creates an arbitrary-stepsize range.
Creates a generator that returns [start, start + step, start + step * 2, ..., stop).
Note that the interval is closed-open [). The stop value is not supposed to be part of the returned sequence.
Parameters: - start (numeric) – The number to start at
- stop (numeric) – The number to end at
- step (numeric) – The delta between points
Returns: frange – A generator that returns the numbers in the range on demand.
Return type: generator
Note
This function does not account for floating-point math errors. This means that there’s a possibility that rounding the last point to the step precision will equal stop. See examples.
Examples
>>> list(frange(1.5, 6.5, 0.5))
[1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
Floating Point Error:
>>> list(frange(1.2, 1.8, 0.2))
[1.2, 1.4, 1.5999999999999999, 1.7999999999999998]
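The floating-point behavior above is what repeated addition produces. The following sketch assumes that approach and a positive step; frange_sketch is a hypothetical name, and the actual implementation is not shown in this reference.

```python
def frange_sketch(start, stop, step):
    """Yield start, start + step, ... while the value is below stop."""
    value = start
    while value < stop:
        yield value
        value += step  # rounding error accumulates with every addition


exact = list(frange_sketch(1.5, 6.5, 0.5))    # 0.5 is exact in binary
drifted = list(frange_sketch(1.2, 1.8, 0.2))  # 0.2 is not
print(len(exact), len(drifted))
```

In the second call the last value lands just below 1.8, so it is yielded even though it rounds to stop.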
-
douglib.core.from_engineering_notation(string)¶ Convert a number string with order-of-magnitude suffix to a float.
Parameters: string (string) – The string to convert.
Returns: number – The numerical equivalent of string.
Return type: float
Examples
>>> from_engineering_notation("1.23m")
0.00123
>>> from_engineering_notation("4.5k")
4500.0
>>> from_engineering_notation("-6.84u")
-6.84e-06
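One way to implement this kind of conversion is a suffix lookup table. The sketch below is an assumption about the approach, not the library's code; parse_engineering and SUFFIX_EXPONENTS are hypothetical names, and the set of supported suffixes is a guess.

```python
# Assumed suffix table; the suffixes douglib actually accepts are not documented here.
SUFFIX_EXPONENTS = {"p": -12, "n": -9, "u": -6, "m": -3,
                    "k": 3, "M": 6, "G": 9, "T": 12}


def parse_engineering(string):
    """Convert a string such as '4.5k' to a float."""
    string = string.strip()
    suffix = string[-1:]
    if suffix in SUFFIX_EXPONENTS:
        # Re-use float()'s own parser by rewriting '4.5k' as '4.5e3'.
        return float("{}e{}".format(string[:-1], SUFFIX_EXPONENTS[suffix]))
    return float(string)  # no recognised suffix


print(parse_engineering("4.5k"))
```

Rewriting the suffix as an exponent avoids the extra rounding step that multiplying by a power of ten would introduce.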
-
douglib.core.hash_file(file_object, hasher, blocksize=65536)¶ Hash a file using a given hashing type.
Parameters: - file_object (io.IOBase object) – The stream to hash.
- hasher (hashlib.HASH object) – The hasher to use.
- blocksize (int, optional) – The block size to read from file_object.
Returns: The hash digest of the stream.
Return type: digest
Note
file_object must already be opened.
Hint
Examples of valid hashers are hashlib.md5(), hashlib.sha256(), etc.
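Block-wise hashing of an open stream can be sketched with the standard hashlib API. hash_stream is a hypothetical stand-in for hash_file, and returning a hex string (rather than raw bytes) is an assumption.

```python
import hashlib
import io


def hash_stream(file_object, hasher, blocksize=65536):
    """Feed an already-open binary stream to hasher in fixed-size blocks."""
    # read() returns b"" at end of stream, which stops the iteration.
    for block in iter(lambda: file_object.read(blocksize), b""):
        hasher.update(block)
    return hasher.hexdigest()  # assumption: hex digest


stream = io.BytesIO(b"hello world")
digest = hash_stream(stream, hashlib.sha256(), blocksize=4)
print(digest)
```

Reading in blocks keeps memory use bounded by blocksize regardless of file size.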
-
douglib.core.interpolate_1d_array(array, x)¶ Emulate LabVIEW’s Interpolate 1D Array function.
Takes a fractional index value x and returns an interpolated Y value.
Parameters: - array (list) – A 1D list of numeric values.
- x (numeric) – The fractional index to interpolate to.
Returns: y – The interpolated value.
Return type: float
Notes
This function only performs linear interpolation.
Note
Timing: O(1)
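Linear interpolation at a fractional index can be sketched as follows. interpolate_by_index is a hypothetical name, the array is assumed to hold at least two values, and the handling of indices outside the array is an assumption (this sketch extrapolates).

```python
import math


def interpolate_by_index(array, x):
    """Linearly interpolate array at the fractional index x."""
    lower = int(math.floor(x))
    # Clamp so that the last index still has a right-hand neighbour.
    lower = max(0, min(lower, len(array) - 2))
    fraction = x - lower
    return array[lower] + fraction * (array[lower + 1] - array[lower])


print(interpolate_by_index([0, 10, 20], 1.5))
```

Only two elements are read regardless of the array length, which is why the lookup is constant time.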
-
douglib.core.max_dist(center, size)¶ Calculate the distance to the farthest corner of a rectangle.
Assumes that the origin is at (0, 0).
If the rectangle’s center is in Q1, then the upper-right corner is the farthest away from the origin. If in Q2, then the upper-left corner is farthest away. Etc.
Returns the magnitude of the largest distance.
Used primarily for calculating if a die has any part outside of wafer’s edge exclusion.
Parameters: - center (tuple of length 2, numerics) – (x, y) tuple defining the rectangle’s center coordinates
- size (tuple of length 2) – (x, y) tuple that defines the size of the rectangle.
Returns: dist – The distance from the origin (0, 0) to the farthest corner of the rectangle.
Return type: numeric
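The quadrant reasoning above collapses into a single expression by symmetry: the farthest corner is always half the rectangle size further out than the center along each axis. A minimal sketch, with farthest_corner_distance as a hypothetical name:

```python
import math


def farthest_corner_distance(center, size):
    """Distance from (0, 0) to the farthest corner of an axis-aligned rectangle."""
    # Taking absolute values folds every quadrant onto Q1.
    far_x = abs(center[0]) + size[0] / 2
    far_y = abs(center[1]) + size[1] / 2
    return math.hypot(far_x, far_y)


print(farthest_corner_distance((2, 3), (2, 2)))
```

Dropping the square root (returning far_x ** 2 + far_y ** 2) gives the squared variant, which is enough when the result is only compared against a squared limit.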
-
douglib.core.max_dist_sqrd(center, size)¶ Calculate the squared distance to the farthest corner of a rectangle.
Assumes that the origin is at (0, 0).
Does not take the square root of the distance for the sake of speed.
If the rectangle’s center is in Q1, then the upper-right corner is the farthest away from the origin. If in Q2, then the upper-left corner is farthest away. Etc.
Returns the squared magnitude of the largest distance.
Used primarily for calculating if a die has any part outside of wafer’s edge exclusion.
Parameters: - center (tuple of length 2, numerics) – (x, y) tuple defining the rectangle’s center coordinates
- size (tuple of length 2) – (x, y) tuple that defines the size of the rectangle.
Returns: dist – The squared distance from the origin (0, 0) to the farthest corner of the rectangle.
Return type: float
-
douglib.core.nearest_indicies(data, x)¶ Find the two array positions (indices) around x.
Parameters: - data (array-like) – A sequence of [x1, x2, ... xn] values
- x (numeric) – The value to search for in data
Returns: indices – The indices which surround the value x. See Notes for more information.
Return type: list
Examples
>>> nearest_indicies([1,4,6,8,10,15], 3)
[0, 1]
>>> nearest_indicies([1,4,6,8,10,15], 6)
[2]
>>> nearest_indicies([1,4,6,8,6,10], 7)   # only returns 1st match
[2, 3]
Note
- Timing: O(n)
- If an exact match is found, returns a list of length 1 which contains the index of the element x. Otherwise, returns a list of length 2 containing the two indices that surround x.
- If there are more than two possible locations, it only returns the first.
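The rules in the note can be sketched as a single linear scan. nearest_indices_sketch is a hypothetical name, and returning an empty list when x lies outside the data is an assumption, since that case is not documented.

```python
def nearest_indices_sketch(data, x):
    """Return [i] for an exact match, else [i, i + 1] for the first bracketing pair."""
    for i, value in enumerate(data):
        if value == x:
            return [i]
        if i + 1 < len(data):
            low, high = sorted((value, data[i + 1]))
            # Strict comparison, so an exact match at i + 1 is reported on its own.
            if low < x < high:
                return [i, i + 1]
    return []  # assumption: x is not bracketed by any neighbouring pair


print(nearest_indices_sketch([1, 4, 6, 8, 6, 10], 7))
```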
-
douglib.core.normal_cdf(x)¶ Return the probability for a z-score of x.
Parameters: x (float) – The z-score at which to evaluate the cumulative distribution function.
Returns: The probability that a value below x will occur.
Return type: float
References
https://en.wikipedia.org/wiki/Normal_distribution#Cumulative_distribution_function
Examples
>>> round(normal_cdf(1.96), 3)
0.975
>>> round(normal_cdf(1.6448536269514722), 3)
0.95
>>> round(normal_cdf(2.5758293035489004), 3)
0.995
>>> round(normal_cdf(0), 3)
0.5
>>> round(normal_cdf(-1), 3)
0.159
# 68-95-99.7 rule
>>> round(normal_cdf(1) - normal_cdf(-1), 2)
0.68
>>> round(normal_cdf(2) - normal_cdf(-2), 2)
0.95
>>> round(normal_cdf(3) - normal_cdf(-3), 3)
0.997
# The probit function should be the inverse of this
>>> round(probit(normal_cdf(1)), 2)
1.0
>>> round(probit(normal_cdf(2)), 2)
2.0
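The standard normal CDF has a closed form in terms of the error function, as given on the referenced Wikipedia page. A minimal sketch using math.erf follows; normal_cdf_sketch is a hypothetical name and the library may compute the value differently.

```python
import math


def normal_cdf_sketch(x):
    """Standard normal CDF: 0.5 * (1 + erf(x / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


print(round(normal_cdf_sketch(1.96), 3))
```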
-
douglib.core.pick_x_at_y(xy_array, y)¶ Manual linear interpolation at a POI.
Parameters: - xy_array (list) – A list in the format [(x1,y1), (x2,y2), ...]
- y (numeric) – The y value to look for.
Returns: x – The x value for the given y.
Return type: numeric
-
douglib.core.position(array, item)¶ Emulate Mathematica’s Position[] function as best as possible.
Only works on 1D arrays.
Parameters: - array (sequence) – The list of items to search through.
- item (any) – The item to search for.
Returns: indices – A generator for the index(es) of item in array. Returns an empty generator if item is not found.
Return type: generator
Examples
>>> list(position([0, 1, 2, 3, 4], 2))
[2]
>>> list(position(["a", "B", "C", "d"], "d"))
[3]
>>> list(position(['1', '1', 'a', 15, 1], '1'))
[0, 1]
Note
Timing: O(1)
-
douglib.core.probit(p)¶ Return the probit function at probability p.
Parameters: p (float) – Probability that a value will be drawn from the returned range. Must be between 0 and 1 inclusive.
Returns: The value of the probit function at p.
Return type: float
Notes
This was shamelessly taken from the Scipy source code. I don’t want to deal with getting a scipy requirement working for this project and I only use this bit from it so... I figured I’d make it myself.
Examples
>>> round(probit(0.025), 2)
-1.96
>>> round(probit(0.975), 2)
1.96
>>> probit(0.5)
0.0
>>> round(probit(0.95), 12)
1.644853626951
-
douglib.core.rc_to_radius(rc_coord, die_xy, center_rc)¶ Convert a die RC coordinate to a radius.
Parameters: - rc_coord (sequence of ints, length 2) – The (row, column) grid coordinate of the die.
- die_xy (sequence of numerics, length 2) – The die (x, y) size. Typically in units of mm.
- center_rc (sequence of numerics, length 2) – The grid (row, column) coordinate which defines the origin (center of the wafer).
Returns: radius – The radius of the center of the die in question.
Return type: float
-
douglib.core.rc_to_radius_sqrd(rc_coord, die_xy, center_rc)¶ Convert a die RC coordinate to a squared radius.
Returns the squared radius for the sake of speed.
Parameters: - rc_coord (sequence of ints, length 2) – The (row, column) grid coordinate of the die.
- die_xy (sequence of numerics, length 2) – The die (x, y) size. Typically in units of mm.
- center_rc (sequence of numerics, length 2) – The grid (row, column) coordinate which defines the origin (center of the wafer).
Returns: radius – The squared radius of the center of the die in question.
Return type: float
-
douglib.core.rcd_to_2d_array(data, missing=0)¶ Convert an array of tuples to a 2D array (matrix-like).
Takes an array of tuples of (Row (y), column (x), data) and converts it to a 2D array where the element index is the row and column value.
Parameters: - data (list of tuples) – The data to convert, in the format [(r1, c1, d1), (r2, c2, d2), ...]
- missing (any, optional) – The value to use for missing points.
Returns: array – The matrix-like array.
Return type: list
Example
>>> data = [[0, 0, 'a'], [0, 1, 'b'], [0, 2, 'c'],
...         [1, 0, 'd'], [1, 1, 'e'], [1, 2, 'f'],
...         [2, 0, 'g'], [2, 2, 'i'],
...         ]
>>> rcd_to_2d_array(data, 'X')
[['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'X', 'i']]
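The conversion can be sketched by pre-filling a grid with the missing value and then writing each data point into place. rcd_to_grid is a hypothetical name; the sketch assumes zero-based, non-negative integer rows and columns and, unlike the documented function, does not need sorted input.

```python
def rcd_to_grid(data, missing=0):
    """Build a row-major 2D list from (row, column, value) triples."""
    n_rows = max(row for row, _, _ in data) + 1
    n_cols = max(col for _, col, _ in data) + 1
    # Start with every cell marked as missing, then overwrite known cells.
    grid = [[missing] * n_cols for _ in range(n_rows)]
    for row, col, value in data:
        grid[row][col] = value
    return grid


data = [[0, 0, 'a'], [0, 1, 'b'], [0, 2, 'c'],
        [1, 0, 'd'], [1, 1, 'e'], [1, 2, 'f'],
        [2, 0, 'g'], [2, 2, 'i']]
print(rcd_to_grid(data, 'X'))
```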
Warning
data must be sorted by Row (y) then by Column (x) values.
-
douglib.core.reedholm_die_to_rc(die_name)¶ Convert the Reedholm die name (“x27y54”) to a row-column tuple.
Parameters: die_name (str) – The die name to parse.
Returns: The (row, column) grid coordinate.
Return type: tuple
-
douglib.core.rescale(x, orig_scale, new_scale=(0, 1))¶ Rescale x to run over a new range.
Rescales x (which was part of scale original_min to original_max) to run over a range (new_min to new_max) such that the value x maintains position on the new scale. If x is outside of the original range, then the result will be outside of the new range.
Default new scale range is 0 to 1 inclusive.
Parameters: - x (numeric) – The value to rescale.
- orig_scale (sequence of numerics, length 2) – The (min, max) value that x typically ranges over.
- new_scale (sequence of numerics, length 2, optional) – The new (min, max) value that the rescaled x should reference.
Returns: result – The rescaled x value.
Return type: float
Examples
>>> rescale(5, (10, 20), (0, 1))
-0.5
>>> rescale(27, (0, 200), (0, 5))
0.675
>>> rescale(1.5, (0, 1), (0, 10))
15.0
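Rescaling is a linear map: find the fractional position of x on the original scale, then apply that fraction to the new scale. A minimal sketch, with rescale_sketch as a hypothetical name:

```python
def rescale_sketch(x, orig_scale, new_scale=(0, 1)):
    """Map x from orig_scale onto new_scale, preserving relative position."""
    orig_min, orig_max = orig_scale
    new_min, new_max = new_scale
    # Fractional position of x; below 0 or above 1 when x is out of range.
    fraction = (x - orig_min) / (orig_max - orig_min)
    return new_min + fraction * (new_max - new_min)


print(rescale_sketch(5, (10, 20), (0, 1)))
```

Because nothing is clamped, out-of-range inputs produce out-of-range outputs, as in the first and third examples above.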
-
douglib.core.rescale_clip(x, orig_scale, new_scale=(0, 1))¶ Same as rescale(), but also clips the new data.
Any result that is below new_min or above new_max is returned as new_min or new_max, respectively.
Parameters: - x (numeric) – The value to rescale.
- orig_scale (sequence of numerics, length 2) – The (min, max) value that x typically ranges over.
- new_scale (sequence of numerics, length 2, optional) – The new (min, max) value that the rescaled x should reference.
Returns: result – The rescaled x value.
Return type: float
Examples
>>> rescale_clip(5, (10, 20), (0, 1))
0
>>> rescale_clip(15, (10, 20), (0, 1))
0.5
>>> rescale_clip(25, (10, 20), (0, 1))
1
-
douglib.core.reservoir_sampling(array, num)¶ Randomly selects a number of elements from array.
Adapted from Wikipedia page on Reservoir Sampling: http://en.wikipedia.org/wiki/Reservoir_sampling
Parameters: - array (list) – The list of items to choose from
- num (int) – The number of elements to choose from array
Returns: list_subset – A random subset of array which is num items long.
Return type: list
Note
Timing: O(n)
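The simplest variant on the referenced Wikipedia page is Algorithm R, sketched below. Whether douglib uses exactly this variant is an assumption, and reservoir_sample is a hypothetical name.

```python
import random


def reservoir_sample(items, num):
    """Pick num items uniformly at random in one pass (Algorithm R)."""
    reservoir = []
    for index, item in enumerate(items):
        if index < num:
            reservoir.append(item)  # fill the reservoir first
        else:
            # Keep the new item with probability num / (index + 1).
            slot = random.randint(0, index)  # inclusive on both ends
            if slot < num:
                reservoir[slot] = item
    return reservoir


sample = reservoir_sample(list(range(100)), 10)
print(sample)
```

Each item is visited once, which matches the O(n) timing noted above, and the input length does not need to be known in advance.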
-
douglib.core.round_to_multiple(x, y)¶ Round x to a multiple of y.
Parameters: - x (numeric) – The value to be rounded.
- y (numeric) – The multiple to round to.
Returns: rounded – x rounded to the nearest multiple of y.
Return type: numeric
Examples
>>> round_to_multiple(1.1234, 0.1)
1.1
>>> round_to_multiple(4.767, 0.3)
4.8
>>> round_to_multiple(1.1234, 0.32)
1.28
>>> round_to_multiple(-1.1234, 0.06)
-1.14
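The core of the calculation can be sketched as dividing by y, rounding to an integer, and multiplying back. round_to_multiple_sketch is a hypothetical name; the real function likely also tidies the result, because this bare version can leave floating-point noise in the last digits.

```python
def round_to_multiple_sketch(x, y):
    """Round x to the nearest multiple of y (ties follow Python's round())."""
    # round() picks the nearest integer count of y; exact ties go to even.
    return round(x / y) * y


print(round_to_multiple_sketch(1.1234, 0.32))
```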
-
douglib.core.significant_sample_size(N, **kwargs)¶ Return the significant sample size.
The significant sample size is the sample size needed to provide a given z-score (or confidence interval) and margin of error from a population of size N and response distribution p. Assumes a normal distribution.
Parameters: - N (int) – The population size.
- Z (float, optional [1.96]) – The Z-score for the desired confidence interval. If given, CI must not be given. Defaults to a confidence interval of 95%.
- CI (float, optional [0.95]) – The desired confidence interval. Must be between 0 and 1 inclusive. If given, Z must not be given. Defaults to a Z-score of 1.96.
- E (float, optional [0.02]) – The desired margin of error. Must be between 0 and 1 inclusive.
- p (float, optional [0.5]) – Response distribution. This is what the expected response rate is. If you aren’t sure, use 0.5 as that results in the largest sample size. Must be between 0 and 1 inclusive.
Returns: n – The number of samples needed.
Return type: int
Examples
>>> significant_sample_size(1000)
706
>>> significant_sample_size(1000, Z=1.6448, E=0.05)
213
>>> significant_sample_size(1000, Z=1.6448, E=0.1)
63
>>> significant_sample_size(1000, Z=1.6448, E=0.1, p=0.3)
53
>>> significant_sample_size(10000)
1936
>>> significant_sample_size(1000, CI=0.95, E=0.02)
706
>>> significant_sample_size(1000, CI=0.96, E=0.02)
725
>>> significant_sample_size(1000, CI=0.95, E=0.03)
516
Notes
The sample size for the statistically significant random sample is given by:
\[n = \frac{N \times Z^2 \times p(1-p)} {(N-1) E^2 +(Z^2 \times p(1-p))}\]
- n = sample size
- N = population size
- Z = z-score for a given confidence interval
- E = margin of error
- p = response distribution (what the expected response rate is)
Info from http://www.raosoft.com/samplesize.html which provides the following equations:
\[x = Z^2 \times p(1-p)\]
\[n = \frac{(N \times x)}{((N-1) \times E^2 + x)}\]
\[E^2 = \frac{(N - n) \times x}{n(N-1)}\]
Note that the website writes \(Z(c)^2\), where \(Z\) is a function of \(c\).
Typical Z-score / confidence interval values are:
- Z = 1.6448536269514722 -> 90%
- Z = 1.959963984540054 -> 95%
- Z = 2.5758293035489004 -> 99%
These values can be obtained from z_score_from_confidence_interval().
For a given population size, the margin of error E typically has a much stronger effect on n than the confidence interval does.
Example:
>>> # Given a population of 1000, a CI of 95%, and a MoE of 2%:
>>> significant_sample_size(1000, CI=0.95, E=0.02)
706
>>> # a 1% change in CI means a 2.5% change in sample size:
>>> significant_sample_size(1000, CI=0.94, E=0.02)   # 1% change in CI
688
>>> # a 1% change in margin of error means a 27% change in sample size:
>>> significant_sample_size(1000, CI=0.95, E=0.03)   # 1% change in error
516
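The formula in the notes can be written out directly. The sketch below takes only a Z-score (no CI argument) and truncates the result to an integer, which is an assumption chosen because it reproduces the documented examples; sample_size_sketch and its parameter names are hypothetical.

```python
def sample_size_sketch(population, z=1.96, margin=0.02, p=0.5):
    """n = N * x / ((N - 1) * E^2 + x), with x = Z^2 * p * (1 - p)."""
    x = z ** 2 * p * (1 - p)
    n = (population * x) / ((population - 1) * margin ** 2 + x)
    return int(n)  # assumption: truncate rather than round


print(sample_size_sketch(1000))
print(sample_size_sketch(1000, z=1.6448, margin=0.1, p=0.3))
```

The second call evaluates to roughly 53.8 before truncation, so rounding to nearest would give 54 rather than the documented 53.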
-
douglib.core.significant_subsample(array, CI=0.95, E=0.02, p=0.5)¶ Return a subarray that is a statistically significant sampling.
Assumes the original array is the entire population.
See docstring for the significant_sample_size function for more information.
Parameters: - array (sequence) – The array to create a subset of.
- CI (float [0.95]) – The desired confidence interval. Must be between 0 and 1 inclusive.
- E (float [0.02]) – The desired margin of error. Must be between 0 and 1 inclusive.
- p (float [0.5]) – Response distribution. This is what the expected response rate is. If you aren’t sure, use 0.5 as that results in the largest sample size. Must be between 0 and 1 inclusive.
Returns: subarray – A random subset of array that is N items long, where N is defined by the input parameters.
Return type: sequence
Note
- Timing: O(n)
-
douglib.core.sort_by_column(big_list, *args, **kwargs)¶ Sort a 2D list by columns defined by args.
Will sort by multiple columns if args is longer than 1 element.
Parameters: - big_list (list) – The list to sort.
- *args (int) – The column(s) to sort by.
- inplace (bool, optional [False]) – If True, the variable sent to big_list will be modified. If False, a copy of the list is made.
Returns: sorted – A copy or reference to the sorted list.
Return type: list
Notes
sort_by_column(A, 3, 1) will sort by the 4th column (index 3) and then by 2nd column (index 1).
sort_by_column(A, 1) is the same as sort_by_column(A, 1, inplace=False).
Sorting in place (inplace=True) means that the data for the variable that you entered (A) will be modified. inplace=False returns a copy of the 2D array and is the default.
Examples
>>> my_array = [[3, 5], [2, 4], [1, 7]]
>>> sort_by_column(my_array, 1)   # sort by column 1 (2nd col) and copy
[[2, 4], [3, 5], [1, 7]]
>>> sort_by_column(my_array, 1, inplace=True)   # modifies my_array
>>> print(my_array)
[[2, 4], [3, 5], [1, 7]]
-
douglib.core.threshold_1d_array(array, y)¶ Emulate LabVIEW’s Threshold 1D Array function.
Takes a Y value and returns a fractional index for that Y value. If the function is not monotonically increasing, it returns the first value found.
Parameters: - array (list) – A 1D list of numeric values.
- y (numeric) – The value to search for.
Returns: fractional_index – A fractional index representing the location of y.
Return type: float
Note
Timing: O(n)
-
douglib.core.to_engineering_notation(number, num_digits=5)¶ Convert a float to string with an SI order-of-magnitude suffix.
Caution
This function can reduce significant digits.
Note
- Only uses suffixes that are multiples of 3.
- Always uses smaller of two options.
Parameters: - number (numeric) – The number to convert.
- num_digits (int, optional) – The maximum number of digits to display in string.
Returns: engr_string – An engineering-formatted string representation of number.
Return type: string
Examples
>>> to_engineering_notation(123456)
'123.46k'
>>> to_engineering_notation(-0.003216)
'-3.216m'
Using num_digits:
>>> to_engineering_notation(1000036, 2)
'1M'
>>> to_engineering_notation(1000036, 6)
'1.00004M'
>>> to_engineering_notation(-0.003216, 1)
'-3m'
>>> to_engineering_notation(-0.003216, 3)
'-3.22m'
>>> to_engineering_notation(32165, 1)
'3e+01k'
>>> to_engineering_notation(32165, 2)
'32k'
>>> to_engineering_notation(32165, 3)
'32.2k'
>>> to_engineering_notation(32165, 4)
'32.16k'
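The outputs above, including '3e+01k', are consistent with scaling the number to a multiple-of-three exponent and formatting the mantissa with the general ("g") format. The sketch below assumes that approach; to_engineering_sketch and SI_SUFFIXES are hypothetical names and the suffix range is a guess.

```python
import math

# Assumed suffix table, limited to exponents that are multiples of 3.
SI_SUFFIXES = {-12: "p", -9: "n", -6: "u", -3: "m", 0: "",
               3: "k", 6: "M", 9: "G", 12: "T"}


def to_engineering_sketch(number, num_digits=5):
    """Format number with an SI suffix and at most num_digits significant digits."""
    if number == 0:
        return "0"
    # Largest multiple-of-three exponent not exceeding the magnitude.
    exponent = int(math.floor(math.log10(abs(number)) / 3)) * 3
    exponent = max(-12, min(12, exponent))  # stay inside the suffix table
    mantissa = number / 10 ** exponent
    return f"{mantissa:.{num_digits}g}{SI_SUFFIXES[exponent]}"


print(to_engineering_sketch(123456))
print(to_engineering_sketch(32165, 1))
```

The "g" format switches to exponent notation when the requested precision is smaller than the number of integer digits, which explains the '3e+01k' result for num_digits=1.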
-
douglib.core.xyd_to_2d_array(data, missing=0)¶ Convert an array of tuples to a 2D array (matrix-like).
Takes an array of (x, y, data) tuples and converts it to a 2D array where the element index is the Y and X value.
Parameters: - data (list of tuples) – The data to convert, in the format [(x1, y1, d1), (x2, y2, d2), ...]
- missing (any, optional) – The value to use for missing points.
Returns: array – The matrix-like array.
Return type: list
Example
>>> data = [[0, 0, 'a'], [0, 1, 'b'], [0, 2, 'c'],
...         [1, 0, 'd'], [1, 1, 'e'], [1, 2, 'f'],
...         [2, 0, 'g'], [2, 2, 'i'],
...         ]
>>> xyd_to_2d_array(data, 'X')
[['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'X', 'i']]
Warning
data must be sorted by X then by Y values.
-
douglib.core.z_score_from_confidence_interval(ci)¶ Return a Z-score for a given confidence interval.
Parameters: ci (float) – The confidence interval to use. Must be between 0 and 1 inclusive.
Returns: The z-score (the number of standard deviations from the mean) for a symmetric interval.
Return type: float
Examples
>>> round(z_score_from_confidence_interval(0.95), 12)
1.95996398454
>>> round(z_score_from_confidence_interval(0.90), 12)
1.644853626951
>>> round(z_score_from_confidence_interval(0.975), 12)
2.241402727605
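For a symmetric interval, the z-score is the inverse normal CDF evaluated at (1 + ci) / 2. The sketch below uses statistics.NormalDist from the standard library (Python 3.8+) in place of the library's own probit(); z_score_sketch is a hypothetical name.

```python
from statistics import NormalDist


def z_score_sketch(ci):
    """Z-score whose symmetric interval around the mean covers a fraction ci."""
    # A symmetric interval leaves (1 - ci) / 2 in each tail.
    return NormalDist().inv_cdf((1 + ci) / 2)


print(round(z_score_sketch(0.95), 6))
```

NormalDist.inv_cdf raises StatisticsError for probabilities of exactly 0 or 1, so ci = 1 is not supported by this sketch.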