Dealing with large lists of numbers
up vote
0
down vote
favorite
I am looking for references that discuss computer programs handling a large collection of numbers that do not fit in machine memory.
Some context:
I am working with GNU multi precision library aka gmplib (double precision was insufficient) and generating a list of numbers; a lot of the numbers being generated are duplicates (some room for improvement, but still lots of duplicates). One of the small lists has ~300,000,000 unique points but I would like to work with more. Afterwards, I need to search through the list and make sure there are no duplicates near the defined precision, and then some further calculation.
Thoughts on my problem:
Naively, just write to a file and store points there. But then I need to search through the list, so perhaps I should keep the list sorted, and at that point, maybe it's just easier to use a database.
So I'm looking into how other people have handled this, maybe Mathematica handles this easily and transparently (my in-memory version, the kernel died after using all machine memory), maybe a C program interfacing to mysql is the best solution (this is where I am now, and it is slow, I am spending time on DB optimization), maybe "get a machine with more RAM" is the best solution, etc. (of course, arbitrary "best" depending on the problem).
References are preferred (I think this question is somewhat general), or even just comments pointing to the right collection of keywords to search, but I welcome any feedback for my specific problem. Thanks.
sequences-and-series reference-request
add a comment |
up vote
0
down vote
favorite
I am looking for references that discuss computer programs handling a large collection of numbers that do not fit in machine memory.
Some context:
I am working with GNU multi precision library aka gmplib (double precision was insufficient) and generating a list of numbers; a lot of the numbers being generated are duplicates (some room for improvement, but still lots of duplicates). One of the small lists has ~300,000,000 unique points but I would like to work with more. Afterwards, I need to search through the list and make sure there are no duplicates near the defined precision, and then some further calculation.
Thoughts on my problem:
Naively, just write to a file and store points there. But then I need to search through the list, so perhaps I should keep the list sorted, and at that point, maybe it's just easier to use a database.
So I'm looking into how other people have handled this, maybe Mathematica handles this easily and transparently (my in-memory version, the kernel died after using all machine memory), maybe a C program interfacing to mysql is the best solution (this is where I am now, and it is slow, I am spending time on DB optimization), maybe "get a machine with more RAM" is the best solution, etc. (of course, arbitrary "best" depending on the problem).
References are preferred (I think this question is somewhat general), or even just comments pointing to the right collection of keywords to search, but I welcome any feedback for my specific problem. Thanks.
sequences-and-series reference-request
This is a question of programming, not mathematics. You would have the same kind of problems if you were generating character strings instead of multiprecision numbers.
– Somos
Nov 16 at 20:40
@Somos Yes, I'm storing in the database as character strings. But the point of the question is to learn from others who have encountered similar challenges doing similar mathematical work. I am aware of generic data store solutions, but was hoping for something more domain specific and provided some context for that. Perhaps you have suggestions of how I can better ask this question to achieve those kinds of answers?
– Burnsba
Nov 16 at 21:11
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
I am looking for references that discuss computer programs handling a large collection of numbers that do not fit in machine memory.
Some context:
I am working with GNU multi precision library aka gmplib (double precision was insufficient) and generating a list of numbers; a lot of the numbers being generated are duplicates (some room for improvement, but still lots of duplicates). One of the small lists has ~300,000,000 unique points but I would like to work with more. Afterwards, I need to search through the list and make sure there are no duplicates near the defined precision, and then some further calculation.
Thoughts on my problem:
Naively, just write to a file and store points there. But then I need to search through the list, so perhaps I should keep the list sorted, and at that point, maybe it's just easier to use a database.
So I'm looking into how other people have handled this, maybe Mathematica handles this easily and transparently (my in-memory version, the kernel died after using all machine memory), maybe a C program interfacing to mysql is the best solution (this is where I am now, and it is slow, I am spending time on DB optimization), maybe "get a machine with more RAM" is the best solution, etc. (of course, arbitrary "best" depending on the problem).
References are preferred (I think this question is somewhat general), or even just comments pointing to the right collection of keywords to search, but I welcome any feedback for my specific problem. Thanks.
sequences-and-series reference-request
I am looking for references that discuss computer programs handling a large collection of numbers that do not fit in machine memory.
Some context:
I am working with GNU multi precision library aka gmplib (double precision was insufficient) and generating a list of numbers; a lot of the numbers being generated are duplicates (some room for improvement, but still lots of duplicates). One of the small lists has ~300,000,000 unique points but I would like to work with more. Afterwards, I need to search through the list and make sure there are no duplicates near the defined precision, and then some further calculation.
Thoughts on my problem:
Naively, just write to a file and store points there. But then I need to search through the list, so perhaps I should keep the list sorted, and at that point, maybe it's just easier to use a database.
So I'm looking into how other people have handled this, maybe Mathematica handles this easily and transparently (my in-memory version, the kernel died after using all machine memory), maybe a C program interfacing to mysql is the best solution (this is where I am now, and it is slow, I am spending time on DB optimization), maybe "get a machine with more RAM" is the best solution, etc. (of course, arbitrary "best" depending on the problem).
References are preferred (I think this question is somewhat general), or even just comments pointing to the right collection of keywords to search, but I welcome any feedback for my specific problem. Thanks.
sequences-and-series reference-request
sequences-and-series reference-request
edited Nov 16 at 19:49
asked Nov 16 at 19:39
Burnsba
412311
412311
This is a question of programming, not mathematics. You would have the same kind of problems if you were generating character strings instead of multiprecision numbers.
– Somos
Nov 16 at 20:40
@Somos Yes, I'm storing in the database as character strings. But the point of the question is to learn from others who have encountered similar challenges doing similar mathematical work. I am aware of generic data store solutions, but was hoping for something more domain specific and provided some context for that. Perhaps you have suggestions of how I can better ask this question to achieve those kinds of answers?
– Burnsba
Nov 16 at 21:11
add a comment |
This is a question of programming, not mathematics. You would have the same kind of problems if you were generating character strings instead of multiprecision numbers.
– Somos
Nov 16 at 20:40
@Somos Yes, I'm storing in the database as character strings. But the point of the question is to learn from others who have encountered similar challenges doing similar mathematical work. I am aware of generic data store solutions, but was hoping for something more domain specific and provided some context for that. Perhaps you have suggestions of how I can better ask this question to achieve those kinds of answers?
– Burnsba
Nov 16 at 21:11
This is a question of programming, not mathematics. You would have the same kind of problems if you were generating character strings instead of multiprecision numbers.
– Somos
Nov 16 at 20:40
This is a question of programming, not mathematics. You would have the same kind of problems if you were generating character strings instead of multiprecision numbers.
– Somos
Nov 16 at 20:40
@Somos Yes, I'm storing in the database as character strings. But the point of the question is to learn from others who have encountered similar challenges doing similar mathematical work. I am aware of generic data store solutions, but was hoping for something more domain specific and provided some context for that. Perhaps you have suggestions of how I can better ask this question to achieve those kinds of answers?
– Burnsba
Nov 16 at 21:11
@Somos Yes, I'm storing in the database as character strings. But the point of the question is to learn from others who have encountered similar challenges doing similar mathematical work. I am aware of generic data store solutions, but was hoping for something more domain specific and provided some context for that. Perhaps you have suggestions of how I can better ask this question to achieve those kinds of answers?
– Burnsba
Nov 16 at 21:11
add a comment |
2 Answers
2
active
oldest
votes
up vote
0
down vote
You have too many numbers to store in RAM. Hence you have to output them (in native binary form) to disk files first. You don't want the files to be too big because searching takes more time. The ideal would be to determine intervals which contain at most a maximum count of numbers of your data.
As a first pass, write all the numbers to one big master file. Then run a scan of that file to get an idea of the distribution of the numbers. Now you can determine intervals without too many numbers in them. On a second pass, scan the master file and output to secondary files each containing only numbers in one interval. Number of intervals is numer of secondary files. From here, one by one, you can read each secondary file into memory and process it there before going on to the next file.
add a comment |
up vote
0
down vote
The problem is that there is too much work for one machine. I thought I could offload data storage, losing latency to gain space. This is basically a variety of "get a better machine" (vertical scaling). However, there seems to be a limit to what is practical to scale vertically in a kind of all-or-none way. That is, trying to use disk-based storage in place of RAM isn't really practical, and isn't really done (to my limited knowledge; of course, this is commonly handled transparently by the host operating system).
It seems the logical evolution to optimize a problem is to break the problem into smaller pieces (horizontal scaling). So I'm thinking I will have to adjust my approach to one based on parallel/distributed computing.
I will mention a few bullet points and starter references, but at this point I think there is not a "real world" on topic (for math.SE) answer like I had originally hoped for (off loading datastore for a single monolithic application). That is, there are a number of papers and even journals that cover distributed and parallel computing, but nothing specifically on solving problems in the manner like I ask about above. With that, I'm not sure if this should be closed as off topic.
Probably the most applicable aspects of a distributed computing project are checkpoints and load balancing.
Checkpointing
In addition to their primary roles in scientific
computation, cores must also perform checkpointing
by saving the exact state of the calculation to disk
periodically, because a volunteer's PC may crash or be
turned off at any time. Since work units can run for
days, checkpointing reduces the amount of work that is
lost to several minutes when the client resumes
operation. Most applications do not normally include
checkpointing, or do so without the complete state
needed, so adding this capability is the most significant
part of making code that is designed to run on a single
system run in a distributed system [1]
Some more discussion on Fault-tolerance and thread migration in [3].
Load balancing
Interestingly, executing these large, decoupled applications avoid many of the common bottlenecks of traditional parallel computing. For example, load balancing is a problem only for small problems, not large ones. With large problems the longest work unit is still only a small fraction of the time the total job requires, which leads to efficient tiling regardless of the heterogeneity in the system. [2]
References
[1] A. L. Beberg, V. S. Pande, D. L. Ensign, S. Khaliq and G. Jayachandran, "Folding@home: Lessons from eight years of volunteer distributed computing," 2009 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), Rome, 2009, pp. 1-8. url: doi.ieeecomputersociety.org/10.1109/IPDPS.2009.5160922
[2] Andrew Chien, Brad Calder, Stephen Elbert, Karan Bhatia, "Entropia: architecture and performance of an enterprise desktop grid system," Journal of Parallel and Distributed Computing, Vol. 63, Iss. 5, 2003, pp 597-610. https://doi.org/10.1016/S0743-7315(03)00006-6
[3] Michael O. Neary, Bernd O. Christiansen, Peter Cappello, Klaus E. Schauser, "Javelin: Parallel computing on the internet," Future Generation Computer Systems, Vol. 15, Iss. 5–6, 1999, pp 659-674. https://doi.org/10.1016/S0167-739X(99)00017-5
add a comment |
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
0
down vote
You have too many numbers to store in RAM. Hence you have to output them (in native binary form) to disk files first. You don't want the files to be too big because searching takes more time. The ideal would be to determine intervals which contain at most a maximum count of numbers of your data.
As a first pass, write all the numbers to one big master file. Then run a scan of that file to get an idea of the distribution of the numbers. Now you can determine intervals without too many numbers in them. On a second pass, scan the master file and output to secondary files each containing only numbers in one interval. Number of intervals is numer of secondary files. From here, one by one, you can read each secondary file into memory and process it there before going on to the next file.
add a comment |
up vote
0
down vote
You have too many numbers to store in RAM. Hence you have to output them (in native binary form) to disk files first. You don't want the files to be too big because searching takes more time. The ideal would be to determine intervals which contain at most a maximum count of numbers of your data.
As a first pass, write all the numbers to one big master file. Then run a scan of that file to get an idea of the distribution of the numbers. Now you can determine intervals without too many numbers in them. On a second pass, scan the master file and output to secondary files each containing only numbers in one interval. Number of intervals is numer of secondary files. From here, one by one, you can read each secondary file into memory and process it there before going on to the next file.
add a comment |
up vote
0
down vote
up vote
0
down vote
You have too many numbers to store in RAM. Hence you have to output them (in native binary form) to disk files first. You don't want the files to be too big because searching takes more time. The ideal would be to determine intervals which contain at most a maximum count of numbers of your data.
As a first pass, write all the numbers to one big master file. Then run a scan of that file to get an idea of the distribution of the numbers. Now you can determine intervals without too many numbers in them. On a second pass, scan the master file and output to secondary files each containing only numbers in one interval. Number of intervals is numer of secondary files. From here, one by one, you can read each secondary file into memory and process it there before going on to the next file.
You have too many numbers to store in RAM. Hence you have to output them (in native binary form) to disk files first. You don't want the files to be too big because searching takes more time. The ideal would be to determine intervals which contain at most a maximum count of numbers of your data.
As a first pass, write all the numbers to one big master file. Then run a scan of that file to get an idea of the distribution of the numbers. Now you can determine intervals without too many numbers in them. On a second pass, scan the master file and output to secondary files each containing only numbers in one interval. Number of intervals is numer of secondary files. From here, one by one, you can read each secondary file into memory and process it there before going on to the next file.
answered Nov 16 at 21:20
Somos
12.7k11034
12.7k11034
add a comment |
add a comment |
up vote
0
down vote
The problem is that there is too much work for one machine. I thought I could offload data storage, losing latency to gain space. This is basically a variety of "get a better machine" (vertical scaling). However, there seems to be a limit to what is practical to scale vertically in a kind of all-or-none way. That is, trying to use disk-based storage in place of RAM isn't really practical, and isn't really done (to my limited knowledge; of course, this is commonly handled transparently by the host operating system).
It seems the logical evolution to optimize a problem is to break the problem into smaller pieces (horizontal scaling). So I'm thinking I will have to adjust my approach to one based on parallel/distributed computing.
I will mention a few bullet points and starter references, but at this point I think there is not a "real world" on topic (for math.SE) answer like I had originally hoped for (off loading datastore for a single monolithic application). That is, there are a number of papers and even journals that cover distributed and parallel computing, but nothing specifically on solving problems in the manner like I ask about above. With that, I'm not sure if this should be closed as off topic.
Probably the most applicable aspects of a distributed computing project are checkpoints and load balancing.
Checkpointing
In addition to their primary roles in scientific
computation, cores must also perform checkpointing
by saving the exact state of the calculation to disk
periodically, because a volunteer's PC may crash or be
turned off at any time. Since work units can run for
days, checkpointing reduces the amount of work that is
lost to several minutes when the client resumes
operation. Most applications do not normally include
checkpointing, or do so without the complete state
needed, so adding this capability is the most significant
part of making code that is designed to run on a single
system run in a distributed system [1]
Some more discussion on Fault-tolerance and thread migration in [3].
Load balancing
Interestingly, executing these large, decoupled applications avoid many of the common bottlenecks of traditional parallel computing. For example, load balancing is a problem only for small problems, not large ones. With large problems the longest work unit is still only a small fraction of the time the total job requires, which leads to efficient tiling regardless of the heterogeneity in the system. [2]
References
[1] A. L. Beberg, V. S. Pande, D. L. Ensign, S. Khaliq and G. Jayachandran, "Folding@home: Lessons from eight years of volunteer distributed computing," 2009 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), Rome, 2009, pp. 1-8. url: doi.ieeecomputersociety.org/10.1109/IPDPS.2009.5160922
[2] Andrew Chien, Brad Calder, Stephen Elbert, Karan Bhatia, "Entropia: architecture and performance of an enterprise desktop grid system," Journal of Parallel and Distributed Computing, Vol. 63, Iss. 5, 2003, pp 597-610. https://doi.org/10.1016/S0743-7315(03)00006-6
[3] Michael O. Neary, Bernd O. Christiansen, Peter Cappello, Klaus E. Schauser, "Javelin: Parallel computing on the internet," Future Generation Computer Systems, Vol. 15, Iss. 5–6, 1999, pp 659-674. https://doi.org/10.1016/S0167-739X(99)00017-5
add a comment |
up vote
0
down vote
The problem is that there is too much work for one machine. I thought I could offload data storage, losing latency to gain space. This is basically a variety of "get a better machine" (vertical scaling). However, there seems to be a limit to what is practical to scale vertically in a kind of all-or-none way. That is, trying to use disk-based storage in place of RAM isn't really practical, and isn't really done (to my limited knowledge; of course, this is commonly handled transparently by the host operating system).
It seems the logical evolution to optimize a problem is to break the problem into smaller pieces (horizontal scaling). So I'm thinking I will have to adjust my approach to one based on parallel/distributed computing.
I will mention a few bullet points and starter references, but at this point I think there is not a "real world" on topic (for math.SE) answer like I had originally hoped for (off loading datastore for a single monolithic application). That is, there are a number of papers and even journals that cover distributed and parallel computing, but nothing specifically on solving problems in the manner like I ask about above. With that, I'm not sure if this should be closed as off topic.
Probably the most applicable aspects of a distributed computing project are checkpoints and load balancing.
Checkpointing
In addition to their primary roles in scientific
computation, cores must also perform checkpointing
by saving the exact state of the calculation to disk
periodically, because a volunteer's PC may crash or be
turned off at any time. Since work units can run for
days, checkpointing reduces the amount of work that is
lost to several minutes when the client resumes
operation. Most applications do not normally include
checkpointing, or do so without the complete state
needed, so adding this capability is the most significant
part of making code that is designed to run on a single
system run in a distributed system [1]
Some more discussion on Fault-tolerance and thread migration in [3].
Load balancing
Interestingly, executing these large, decoupled applications avoid many of the common bottlenecks of traditional parallel computing. For example, load balancing is a problem only for small problems, not large ones. With large problems the longest work unit is still only a small fraction of the time the total job requires, which leads to efficient tiling regardless of the heterogeneity in the system. [2]
References
[1] A. L. Beberg, V. S. Pande, D. L. Ensign, S. Khaliq and G. Jayachandran, "Folding@home: Lessons from eight years of volunteer distributed computing," 2009 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), Rome, 2009, pp. 1-8. url: doi.ieeecomputersociety.org/10.1109/IPDPS.2009.5160922
[2] Andrew Chien, Brad Calder, Stephen Elbert, Karan Bhatia, "Entropia: architecture and performance of an enterprise desktop grid system," Journal of Parallel and Distributed Computing, Vol. 63, Iss. 5, 2003, pp 597-610. https://doi.org/10.1016/S0743-7315(03)00006-6
[3] Michael O. Neary, Bernd O. Christiansen, Peter Cappello, Klaus E. Schauser, "Javelin: Parallel computing on the internet," Future Generation Computer Systems, Vol. 15, Iss. 5–6, 1999, pp 659-674. https://doi.org/10.1016/S0167-739X(99)00017-5
add a comment |
up vote
0
down vote
up vote
0
down vote
The problem is that there is too much work for one machine. I thought I could offload data storage, losing latency to gain space. This is basically a variety of "get a better machine" (vertical scaling). However, there seems to be a limit to what is practical to scale vertically in a kind of all-or-none way. That is, trying to use disk-based storage in place of RAM isn't really practical, and isn't really done (to my limited knowledge; of course, this is commonly handled transparently by the host operating system).
It seems the logical evolution to optimize a problem is to break the problem into smaller pieces (horizontal scaling). So I'm thinking I will have to adjust my approach to one based on parallel/distributed computing.
I will mention a few bullet points and starter references, but at this point I think there is not a "real world" on topic (for math.SE) answer like I had originally hoped for (off loading datastore for a single monolithic application). That is, there are a number of papers and even journals that cover distributed and parallel computing, but nothing specifically on solving problems in the manner like I ask about above. With that, I'm not sure if this should be closed as off topic.
Probably the most applicable aspects of a distributed computing project are checkpoints and load balancing.
Checkpointing
In addition to their primary roles in scientific
computation, cores must also perform checkpointing
by saving the exact state of the calculation to disk
periodically, because a volunteer's PC may crash or be
turned off at any time. Since work units can run for
days, checkpointing reduces the amount of work that is
lost to several minutes when the client resumes
operation. Most applications do not normally include
checkpointing, or do so without the complete state
needed, so adding this capability is the most significant
part of making code that is designed to run on a single
system run in a distributed system [1]
Some more discussion on Fault-tolerance and thread migration in [3].
Load balancing
Interestingly, executing these large, decoupled applications avoid many of the common bottlenecks of traditional parallel computing. For example, load balancing is a problem only for small problems, not large ones. With large problems the longest work unit is still only a small fraction of the time the total job requires, which leads to efficient tiling regardless of the heterogeneity in the system. [2]
References
[1] A. L. Beberg, V. S. Pande, D. L. Ensign, S. Khaliq and G. Jayachandran, "Folding@home: Lessons from eight years of volunteer distributed computing," 2009 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), Rome, 2009, pp. 1-8. url: doi.ieeecomputersociety.org/10.1109/IPDPS.2009.5160922
[2] Andrew Chien, Brad Calder, Stephen Elbert, Karan Bhatia, "Entropia: architecture and performance of an enterprise desktop grid system," Journal of Parallel and Distributed Computing, Vol. 63, Iss. 5, 2003, pp 597-610. https://doi.org/10.1016/S0743-7315(03)00006-6
[3] Michael O. Neary, Bernd O. Christiansen, Peter Cappello, Klaus E. Schauser, "Javelin: Parallel computing on the internet," Future Generation Computer Systems, Vol. 15, Iss. 5–6, 1999, pp 659-674. https://doi.org/10.1016/S0167-739X(99)00017-5
The problem is that there is too much work for one machine. I thought I could offload data storage, losing latency to gain space. This is basically a variety of "get a better machine" (vertical scaling). However, there seems to be a limit to what is practical to scale vertically in a kind of all-or-none way. That is, trying to use disk-based storage in place of RAM isn't really practical, and isn't really done (to my limited knowledge; of course, this is commonly handled transparently by the host operating system).
It seems the logical evolution to optimize a problem is to break the problem into smaller pieces (horizontal scaling). So I'm thinking I will have to adjust my approach to one based on parallel/distributed computing.
I will mention a few bullet points and starter references, but at this point I think there is not a "real world" on topic (for math.SE) answer like I had originally hoped for (off loading datastore for a single monolithic application). That is, there are a number of papers and even journals that cover distributed and parallel computing, but nothing specifically on solving problems in the manner like I ask about above. With that, I'm not sure if this should be closed as off topic.
Probably the most applicable aspects of a distributed computing project are checkpoints and load balancing.
Checkpointing
In addition to their primary roles in scientific
computation, cores must also perform checkpointing
by saving the exact state of the calculation to disk
periodically, because a volunteer's PC may crash or be
turned off at any time. Since work units can run for
days, checkpointing reduces the amount of work that is
lost to several minutes when the client resumes
operation. Most applications do not normally include
checkpointing, or do so without the complete state
needed, so adding this capability is the most significant
part of making code that is designed to run on a single
system run in a distributed system [1]
Some more discussion on Fault-tolerance and thread migration in [3].
Load balancing
Interestingly, executing these large, decoupled applications avoid many of the common bottlenecks of traditional parallel computing. For example, load balancing is a problem only for small problems, not large ones. With large problems the longest work unit is still only a small fraction of the time the total job requires, which leads to efficient tiling regardless of the heterogeneity in the system. [2]
References
[1] A. L. Beberg, V. S. Pande, D. L. Ensign, S. Khaliq and G. Jayachandran, "Folding@home: Lessons from eight years of volunteer distributed computing," 2009 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), Rome, 2009, pp. 1-8. url: doi.ieeecomputersociety.org/10.1109/IPDPS.2009.5160922
[2] Andrew Chien, Brad Calder, Stephen Elbert, Karan Bhatia, "Entropia: architecture and performance of an enterprise desktop grid system," Journal of Parallel and Distributed Computing, Vol. 63, Iss. 5, 2003, pp 597-610. https://doi.org/10.1016/S0743-7315(03)00006-6
[3] Michael O. Neary, Bernd O. Christiansen, Peter Cappello, Klaus E. Schauser, "Javelin: Parallel computing on the internet," Future Generation Computer Systems, Vol. 15, Iss. 5–6, 1999, pp 659-674. https://doi.org/10.1016/S0167-739X(99)00017-5
answered Nov 19 at 17:42
Burnsba
412311
412311
add a comment |
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3001561%2fdealing-with-large-lists-of-numbers%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
This is a question of programming, not mathematics. You would have the same kind of problems if you were generating character strings instead of multiprecision numbers.
– Somos
Nov 16 at 20:40
@Somos Yes, I'm storing in the database as character strings. But the point of the question is to learn from others who have encountered similar challenges doing similar mathematical work. I am aware of generic data store solutions, but was hoping for something more domain specific and provided some context for that. Perhaps you have suggestions of how I can better ask this question to achieve those kinds of answers?
– Burnsba
Nov 16 at 21:11