Thursday, April 3, 2008

Multithreading:

Multithreading is a general-purpose programming technique in which a program runs several threads concurrently so that the work can be shared among them. In the sense used here, the same processing logic runs in as many threads as needed, with every thread accessing the same file in any open mode.
One can simulate this concept by running one program in many threads (jobs), each reading the same file and processing records from it based on an arbitrary set of criteria. For a file of "n" records split across "m" jobs, each job handles about n/m records, so in the ideal case the processing time drops to about 1/m of the single-job time.
In practice, processing time depends on many factors (such as the number of processors and the number of other jobs running), so we cannot say that it will fall to exactly 1/m of the original. All we can say is that it should reduce considerably.
The idea is to divide the number of records in the file by "m", a predefined number of threads that can be chosen based on the size of the file. This gives the number of records for each thread, from which the program works out the relative record number (rrn) slot each thread needs. Say there are 100 records and 5 threads: thread 1 gets rrn 1 to 20, thread 2 gets 21 to 40, and so on. The last thread gets a full slot, or a smaller one if the number of records divided by the number of threads is not a whole number.
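The slot arithmetic can be sketched in a few lines of Python. This is only an illustration, not the original program: the function name `rrn_slots` and its parameters are made up for this example, and it assumes the slot size is rounded up so that only the last thread can receive a short slot.

```python
import math

def rrn_slots(n_records, n_threads):
    """Return a list of 1-based, inclusive (first_rrn, last_rrn) slots.

    Hypothetical helper: the slot size is rounded up, so every slot is
    full except possibly the last one.
    """
    if n_records < 0 or n_threads < 1:
        raise ValueError("need n_records >= 0 and n_threads >= 1")
    slot_size = math.ceil(n_records / n_threads)
    slots = []
    first = 1
    while first <= n_records:
        # Clamp the end of the slot to the last record in the file.
        last = min(first + slot_size - 1, n_records)
        slots.append((first, last))
        first = last + 1
    return slots

even_split = rrn_slots(100, 5)    # the 100-record, 5-thread example
uneven_split = rrn_slots(103, 5)  # last slot is shorter than the others
```

With 100 records and 5 threads this yields the slots from the example (1 to 20, 21 to 40, and so on). With 103 records the slot size becomes 21 and the last thread gets rrn 85 to 103. If there are fewer records than threads, the sketch simply returns fewer slots than threads.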
In a nutshell, we run multiple jobs in parallel in such a way that each job is allocated a unique set of records for processing as opposed to having one job processing all the records.
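As a rough sketch of the same idea, the Python snippet below runs one worker per slot using the standard `concurrent.futures.ThreadPoolExecutor`. Several things here are assumptions for illustration: an in-memory list stands in for the file, `process_slot` and `run_jobs` are hypothetical names, and the selection criterion (even record values) is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor

def process_slot(records, first_rrn, last_rrn):
    """Hypothetical job: count the records in one slot that meet a criterion."""
    # rrn values are 1-based and inclusive, so shift to a 0-based slice.
    slot = records[first_rrn - 1:last_rrn]
    return sum(1 for rec in slot if rec % 2 == 0)

def run_jobs(records, slots):
    """Run one thread per slot and collect the per-slot results in order."""
    with ThreadPoolExecutor(max_workers=len(slots)) as pool:
        futures = [pool.submit(process_slot, records, first, last)
                   for first, last in slots]
        return [future.result() for future in futures]

# Stand-in for a 100-record file; each "record" is just an integer.
records = list(range(1, 101))
slots = [(1, 20), (21, 40), (41, 60), (61, 80), (81, 100)]

per_job = run_jobs(records, slots)
total = sum(per_job)
```

Because the slots do not overlap, no record is processed twice and the per-job results can simply be added together. How much time this saves depends on the environment; in CPython, for instance, threads mostly help when the work waits on I/O rather than on the CPU.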
