[metapost] reading a large array

luigi scarso luigi.scarso at gmail.com
Fri Mar 10 19:59:17 CET 2017


On Fri, Mar 10, 2017 at 7:48 PM, Qiong Cai <qiong.cai at gmail.com> wrote:

> Yes, I also figured it out this way. I used a 2D array to speed up the
> reading. See the comparison below.
>
> Thanks all!
>
>
> // 1D results
>
> \begin{tabular}{ll}
> \toprule
> Number of Numbers & Reading Time \\
> \midrule
> 10 & 0.096s \\
> 100 & 0.096s \\
> 1K & 0.100s \\
> 10K & 0.463s \\
> 100K & 1m23s \\
> 500K & 30m21s \\
> \bottomrule
> \end{tabular}
> \caption{The running time of readFromFile on my iMac (3.2GHz, SKL). The
> solution does not scale beyond 10K data points.}
>
>
>
> // 2D results, I used 1K for each subarray
>
> \begin{tabular}{ll}
> \toprule
> Number of Numbers & Reading Time \\
> \midrule
> 100K & 1.075s \\
> 500K & 5.434s \\
> 1M & 14s \\
> 5M & 6m27s \\
> \bottomrule
> \end{tabular}
> \caption{The running time of fastReadFromFile on my iMac (3.2GHz, SKL).
> For 500K data points, it is 335x faster than readFromFile. However, the
> solution does not scale beyond 5M data points. I can improve it further
> by using multidimensional arrays when my data set reaches 5M points.}
> \label{table:fastReadFromFile}
>
>
>
>
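
For reference, a minimal sketch of the chunked 2D layout described in the quoted message. It is not the poster's actual fastReadFromFile, which is not shown in this thread. The names data, chunk and n and the file name data.txt are placeholders, and each line is stored as a string. Counts above 4095 need mpost --numbersystem=double, because the default scaled arithmetic overflows at 4096.

```metapost
%% chunked-read-sketch.mp (hypothetical file name)
string data[][];
numeric chunk ; chunk := 1000 ;   % 1K entries per subarray, as in the quote
numeric n ; n := 0 ;              % number of lines stored so far
string s ;
forever:
    s := readfrom "data.txt" ;    % placeholder input file
    exitif s = EOF ;
    % Row = which chunk, column = position inside that chunk.
    data[n div chunk][n mod chunk] := s ;
    n := n + 1 ;
endfor ;
message "lines read: " & decimal n ;
end
```

If data.txt does not exist, readfrom returns EOF at once and the reported count is 0.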

%% test-032M.mp
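% Four subscript levels: the coarse subscripts (c div 1000, c div 100,
% c div 10) only group entries; the final [c] keeps each one unique.
string foo[][][][];

def set_foo(expr c, s) =
    foo[c div 1000][c div 100][c div 10][c] := s ;
enddef ;
def get_foo(expr c) =
    foo[c div 1000][c div 100][c div 10][c]
enddef ;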

string s;
numeric c ; c := 0;
% Read the file line by line; readfrom returns EOF at end of file.
forever:
    s := readfrom "test-032M.txt";
    exitif s = EOF;
    c := c + 1 ;
    set_foo(c,s) ;
endfor ;
% Read every stored string back once.
numeric l ;
for i=1 upto c :
    l := length(get_foo(i)) ;
endfor ;

end
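
The same four-level scheme can be checked without an input file. The following is only a sketch: the test strings "line 1" to "line 2500", the counter mismatches and the file name are made up for illustration. The range stays below 4096, so it also runs with the default scaled number system.

```metapost
%% foo-roundtrip-sketch.mp (hypothetical file name)
string foo[][][][];

def set_foo(expr c, s) =
    foo[c div 1000][c div 100][c div 10][c] := s ;
enddef ;
def get_foo(expr c) =
    foo[c div 1000][c div 100][c div 10][c]
enddef ;

% Store one distinct string per index.
for i = 1 upto 2500 :
    set_foo(i, "line " & decimal i) ;
endfor ;

% Read each entry back and count the ones that differ.
numeric mismatches ; mismatches := 0 ;
for i = 1 upto 2500 :
    % Parentheses make sure & is applied before <>.
    if get_foo(i) <> ("line " & decimal i) :
        mismatches := mismatches + 1 ;
    fi
endfor ;
message "mismatches: " & decimal mismatches ;
end
```

Because the last subscript is the index itself, two different indices can never share a slot, whatever the coarse subscripts evaluate to.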



# nl test-032M.txt |tail
287991
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
287992
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
287993
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
287994
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
287995
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
287996
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
287997
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
287998
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
287999
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
288000
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
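
To reproduce a comparable input, a file of identical lines can be written from MetaPost itself. This is a sketch under assumptions: the line length in linelen is a guess, the output name test-lines.txt is a placeholder, and only the line count is taken from the nl output above. Run it with mpost --numbersystem=double, since 288000 exceeds the range of the default scaled numbers.

```metapost
%% make-lines-sketch.mp (hypothetical file name)
numeric linelen, nlines ;
linelen := 100 ;      % assumed number of x characters per line
nlines := 288000 ;    % line count reported by nl

% Build one line of linelen x characters.
string xs ; xs := "" ;
for i = 1 upto linelen : xs := xs & "x" ; endfor ;

% Write the same line nlines times.
for i = 1 upto nlines :
    write xs to "test-lines.txt" ;
endfor ;
write EOF to "test-lines.txt" ;   % writing EOF closes the file
end
```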


# \time -v mpost --numbersystem=double test-032M.mp
This is MetaPost, version 1.9991 (TeX Live 2017/dev) (kpathsea version 6.2.3/dev)


Preloading the plain mem file, version 1.005) ) (./test-032M.mp )
Transcript written on test-032M.log.
Command being timed: "mpost --numbersystem=double test-032M.mp"
User time (seconds): 5.98
System time (seconds): 0.04
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.03
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 79988
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 19378
Voluntary context switches: 175
Involuntary context switches: 9
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0




-- 
luigi