Partitioning Cache Data
Revision as of 08:22, 21 June 2013
Partitioned Cost Reductions
Quorum in the network is 3. (This is down from the previous value of 4... find out why).
Any sort of partitioning strategy would need to be implemented at the titledb.xml level. Title AUs can be assigned to peers centrally, and each peer should receive a custom titledb.xml file. If LOCKSS is unwilling to support that, there is an alternative using local parameters:
org.lockss.titleDbs = http://bpldb.bplonline.org/etc/adpn/titledb-local.xml
Implementing a 1 + 6 partitioning strategy (the AU owner plus 6 additional network nodes) saves about 12% of storage per node on average. Adding 2 nodes to the network raises the average per-node saving to about 30%, and adding 4 nodes raises it to about 41%. This means we could store up to 18 TB of data on 10.6 TB nodes.
A 1 + 5 strategy places each AU on 6 discrete nodes in the network (double the quorum). With no additional nodes it reduces per-node storage by about 25% on average; with 4 additional nodes the average reduction approaches 50% (about 47% in the sample data below).
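The percentages above are consistent with simple replica arithmetic: if each AU is held by k of the n nodes instead of all n, the average per-node load falls by 1 - k/n. The following is a minimal sketch of that calculation, assuming AUs are spread evenly across nodes; the class and method names (PartitionMath, ExpectedReduction) are hypothetical and not part of LOCKSS.

```csharp
using System;

static class PartitionMath
{
    // Expected average per-node storage reduction when each AU is held by
    // `copies` nodes out of `nodes`, compared with every node holding every AU.
    // Assumption: AUs are spread evenly; real runs deviate from this.
    public static double ExpectedReduction(int copies, int nodes)
    {
        if (nodes < 1) throw new ArgumentOutOfRangeException(nameof(nodes));
        if (copies < 1 || copies > nodes) throw new ArgumentOutOfRangeException(nameof(copies));
        return 1.0 - (double)copies / nodes;
    }

    public static void Main()
    {
        // 1 + 6 means 7 copies per AU; 1 + 5 means 6 copies per AU.
        // The network in this page has 8 nodes before any are added.
        var cases = new (string Label, int Copies, int Nodes)[]
        {
            ("1 + 6, no additional nodes", 7, 8),   // 1 - 7/8  = 0.125
            ("1 + 6, 2 additional nodes",  7, 10),  // 1 - 7/10 = 0.30
            ("1 + 6, 4 additional nodes",  7, 12),  // 1 - 7/12 = 0.4167 (rounded)
            ("1 + 5, no additional nodes", 6, 8),   // 1 - 6/8  = 0.25
            ("1 + 5, 4 additional nodes",  6, 12),  // 1 - 6/12 = 0.50
        };

        foreach (var c in cases)
        {
            double reduction = ExpectedReduction(c.Copies, c.Nodes);
            Console.WriteLine(c.Label + ": " + Math.Round(reduction * 100, 1) + "% less per node");
        }
    }
}
```

The same ratio gives the capacity figure: 18 TB of content at 7 copies across 12 nodes is 18 × 7/12 = 10.5 TB per node, close to the 10.6 TB quoted above.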
All nodes (default configuration)
au_host  AUCount  ContentSize (TB)  DiskCost (TB)
ADAH     778      5.11              5.13
AUB      778      5.11              5.13
BPL      778      5.11              5.13
SHC      778      5.11              5.13
TROY     778      5.11              5.13
UAB      778      5.11              5.13
UAT      778      5.11              5.13
UNA      778      5.11              5.13
                  40.88             41.04
Does not include vacated publisher AUs (which total between 500 and 600 GB).
1 + 6 no additional nodes
au_host  AUCount  ContentSize (TB)  DiskCost (TB)  count change  size change  cost change
ADAH     667      4.60              4.62           -14.27%       -9.99%       -9.97%
AUB      676      3.83              3.85           -13.11%       -25.00%      -24.94%
BPL      669      4.49              4.51           -14.01%       -12.19%      -12.15%
SHC      666      4.48              4.49           -14.40%       -12.41%      -12.41%
TROY     666      4.29              4.30           -14.40%       -16.13%      -16.09%
UAB      667      4.58              4.60           -14.27%       -10.29%      -10.26%
UAT      767      5.06              5.09           -1.41%        -0.89%       -0.86%
UNA      667      4.43              4.45           -14.27%       -13.27%      -13.23%
                  35.76             35.91                        -12.52%      -12.49%
1 + 6 with 2 additional nodes
au_host  AUCount  ContentSize (TB)  DiskCost (TB)  count change  size change  cost change
ADAH     519      3.93              3.95           -33.29%       -23.00%      -22.98%
AUB      540      3.21              3.23           -30.59%       -37.15%      -37.09%
BPL      522      3.09              3.11           -32.90%       -39.49%      -39.44%
SHC      519      3.22              3.23           -33.29%       -36.97%      -36.96%
TROY     519      3.26              3.28           -33.29%       -36.15%      -36.09%
UAB      519      3.02              3.03           -33.29%       -40.93%      -40.89%
UAT      751      4.89              4.91           -3.47%        -4.40%       -4.38%
UNA      519      3.26              3.28           -33.29%       -36.15%      -36.09%
ADAH2    518      3.94              3.95
BPL2     519      3.94              3.95
                  35.76             35.91                        -12.52%      -12.49%
1 + 6 with 4 additional nodes
au_host  AUCount  ContentSize (TB)  DiskCost (TB)  count change  size change
ADAH     503      2.60              2.61           -35.35%       -49.11%
AUB      425      3.16              3.17           -45.37%       -38.13%
BPL      405      2.47              2.48           -47.94%       -51.57%
SHC      408      2.83              2.84           -47.56%       -44.66%
TROY     401      3.34              3.35           -48.46%       -34.68%
UAB      413      2.50              2.51           -46.92%       -51.13%
UAT      740      4.79              4.81           -4.88%        -6.23%
UNA      357      2.97              2.98           -54.11%       -41.96%
ADAH2    407      2.52              2.54
AUB2     501      3.25              3.26
BPL2     457      2.86              2.87
UAT2     428      2.48              2.49
                  35.76             35.91          -41.32%       -39.68%
1 + 5 with no additional nodes
au_host  AUCount  ContentSize (TB)  DiskCost (TB)  count change  size change
ADAH     581      3.83              3.85           -25.32%       -24.87%
AUB      539      3.97              3.99           -30.72%       -22.12%
BPL      537      3.72              3.73           -30.98%       -27.13%
SHC      590      3.36              3.38           -24.16%       -34.07%
TROY     555      3.79              3.81           -28.66%       -25.71%
UAB      572      3.68              3.70           -26.48%       -27.87%
UAT      757      4.92              4.93           -2.70%        -3.68%
UNA      536      3.34              3.35           -31.11%       -34.56%
                  30.65             30.78          -25.02%       -25.00%
1 + 5 with 4 additional nodes
au_host  AUCount  ContentSize (TB)  DiskCost (TB)  count change  size change
ADAH     395      1.95              1.96           -49.23%       -61.80%
AUB      349      2.90              2.91           -55.14%       -43.30%
BPL      463      2.77              2.78           -40.49%       -45.74%
SHC      350      2.94              2.95           -55.01%       -42.46%
TROY     327      1.83              1.84           -57.97%       -64.18%
UAB      331      2.10              2.11           -57.46%       -58.88%
UAT      736      4.85              4.87           -5.40%        -5.02%
UNA      312      2.37              2.38           -59.90%       -53.55%
ADAH2    332      2.06              2.07
AUB2     328      2.52              2.53
BPL2     367      2.53              2.54
UAT2     377      1.83              1.84
                  30.65             30.78          -47.57%       -46.87%
Distribution Algorithms
TBD...
Sample data was generated using a static array of nodes, shuffled with a Fisher-Yates shuffle, with the least used node then moved to the first array position.
string[] _nodes = { "ADAH", "AUB", "UAT", "UAB", "BPL", "UNA", "TROY", "SHC" };
Shuffle(_nodes); // Fisher-Yates shuffle

// pick one definition of "least used"
//string _leastUsedNode = GetLeastUsedNodeByAUCount();
//string _leastUsedNode = GetLeastUsedNodeByContentSize();
string _leastUsedNode = GetLeastUsedNodeByDiskCost();

// put at beginning of array
Swap(_nodes, _leastUsedNode);

int _counter = 0;
// number of additional nodes to insert: 5 for 1 + 5 (6 for 1 + 6)
foreach (string _node in _nodes)
{
    if (_counter == 5) break;
    // skip the AU owner, which is the "1" in 1 + 5
    if (_node.Equals(_row["au_owner"].ToString())) continue;

    // process
    _counter++;
}
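The fragment above leaves Shuffle, Swap, and the least-used lookup undefined. The following is one possible self-contained sketch of the same idea, not the code that produced the sample data. It assumes usage is tracked in an in-memory dictionary of running disk cost per node; the names AuDistributor, Assign, and usage are hypothetical, and the AU sizes in Main are illustrative values only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AuDistributor
{
    // Fisher-Yates shuffle: walk from the last slot down, swapping each slot
    // with a uniformly chosen slot at or before it.
    public static void Shuffle<T>(T[] items, Random rng)
    {
        for (int i = items.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            (items[i], items[j]) = (items[j], items[i]);
        }
    }

    // Choose `extraCopies` hosts for one AU in addition to its owner.
    // `usage` is assumed bookkeeping: running disk cost per node, in TB.
    public static List<string> Assign(string owner, double auCost, int extraCopies,
                                      string[] nodes, Dictionary<string, double> usage,
                                      Random rng)
    {
        string[] order = (string[])nodes.Clone();
        Shuffle(order, rng);

        // Move the least used node to the first array position.
        string leastUsed = order.OrderBy(n => usage[n]).First();
        int at = Array.IndexOf(order, leastUsed);
        (order[0], order[at]) = (order[at], order[0]);

        var hosts = new List<string>();
        foreach (string node in order)
        {
            if (hosts.Count == extraCopies) break;
            if (node == owner) continue; // the owner already holds the AU
            hosts.Add(node);
            usage[node] += auCost;
        }
        usage[owner] += auCost; // the owner's copy counts toward its disk cost
        return hosts;
    }

    public static void Main()
    {
        string[] nodes = { "ADAH", "AUB", "UAT", "UAB", "BPL", "UNA", "TROY", "SHC" };
        var usage = nodes.ToDictionary(n => n, n => 0.0);
        var rng = new Random();

        // Three made-up AUs, all owned by UAT, distributed 1 + 5.
        foreach (double auCost in new[] { 0.02, 0.01, 0.03 })
        {
            List<string> hosts = Assign("UAT", auCost, 5, nodes, usage, rng);
            Console.WriteLine("UAT + " + string.Join(", ", hosts));
        }
    }
}
```

Because the owner is skipped and always keeps its own AUs, a node that owns most of the content sees little reduction, which is consistent with the small changes for UAT in the tables above.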