Shamir's secret sharing method provides a technique to secure data by decomposing the data in to multiple blocks and store them on different machines. It transforms data in to n blocks such that even access to k -1 blocks of data doesn't reveal any information about the original data. This technique is called (k, n) threshold scheme which can be used to secure data without using encryption keys.
It is based on polynomial interpolation on finite field. Data is set as the constant term (Coefficient a0)in a polynomial of degree k - 1. A polynomial is constructed by choosing random k -1 Coefficients (a1, a2..ak-1) from a field and the data is converted to a number (D)and is set as Coefficient a0.
f(x) = a0 + a1x + ... + a^(k-1)x
The field is generated by choosing a prime number greater than D and n. This polynomial is evaluated at n points where D1 = f(1), D2= f(2),..., Dn = f(n) and these n data blocks are stored. So the original data D is not stored but it is calculated from k blocks.
The way I understood this technique is by geometry perspective, using a line. For example if we have given a point (x1, y1) in 2 dimensional space, there can be infinite number of lines pass through it. To calculate the y-intercept we need at least two points to calculate the slope of the line to find it. Without 2 points there are infinite possible y-intercepts.
A line is represented by f(x) = a1 * x + a0 a first degree polynomial. The term y-intercept is calculated by setting x to 0. If we choose (2, n) threshold scheme to secure data, n blocks of data (D1,D2 to Dn) is calculated by evaluating f(x) at x = 1 to n. We need at least two blocks and their indexes to calculate the slope. The index value gives the x coordinate value and the data block value is the y value. With two points (x1, y1) and (x2, y2), the slope can be calculated as
m = (y2 - y1) / (x2 - x1)
The slope is the coefficient a1. Once we calculate a1 now we can calculate a0 by calculating y1 - a1x1 or y2 - a1x2. In the above graph, the slope of the line is 0.5 and a0 is 1.
It is based on polynomial interpolation on finite field. Data is set as the constant term (Coefficient a0)in a polynomial of degree k - 1. A polynomial is constructed by choosing random k -1 Coefficients (a1, a2..ak-1) from a field and the data is converted to a number (D)and is set as Coefficient a0.
f(x) = a0 + a1x + ... + a^(k-1)x
The field is generated by choosing a prime number greater than D and n. This polynomial is evaluated at n points where D1 = f(1), D2= f(2),..., Dn = f(n) and these n data blocks are stored. So the original data D is not stored but it is calculated from k blocks.
The way I understood this technique is by geometry perspective, using a line. For example if we have given a point (x1, y1) in 2 dimensional space, there can be infinite number of lines pass through it. To calculate the y-intercept we need at least two points to calculate the slope of the line to find it. Without 2 points there are infinite possible y-intercepts.
A line is represented by f(x) = a1 * x + a0 a first degree polynomial. The term y-intercept is calculated by setting x to 0. If we choose (2, n) threshold scheme to secure data, n blocks of data (D1,D2 to Dn) is calculated by evaluating f(x) at x = 1 to n. We need at least two blocks and their indexes to calculate the slope. The index value gives the x coordinate value and the data block value is the y value. With two points (x1, y1) and (x2, y2), the slope can be calculated as
m = (y2 - y1) / (x2 - x1)
The slope is the coefficient a1. Once we calculate a1 now we can calculate a0 by calculating y1 - a1x1 or y2 - a1x2. In the above graph, the slope of the line is 0.5 and a0 is 1.