]> Cypherpunks repositories - gostls13.git/commit
encoding/csv: avoid allocations when reading records
authorJustin Nuß <nuss.justin@gmail.com>
Sun, 3 Jul 2016 15:49:29 +0000 (17:49 +0200)
committerBrad Fitzpatrick <bradfitz@golang.org>
Wed, 5 Oct 2016 16:57:44 +0000 (16:57 +0000)
commitbd06d4827ae637cd08f85962f996760e76e28efc
tree1872d95a6ae41319d7b00a9c6e2b0faf6cec61cd
parentdce0df29dd9052c0f00ce8217b9a51a84206e892
encoding/csv: avoid allocations when reading records

This commit changes parseRecord to allocate a single string per record,
instead of per field, by using indexes into the raw record.

Benchstat (done with f69991c17)

name                          old time/op    new time/op    delta
Read-8                          3.17µs ± 0%    2.78µs ± 1%  -12.35%  (p=0.016 n=4+5)
ReadWithFieldsPerRecord-8       3.18µs ± 1%    2.79µs ± 1%  -12.23%  (p=0.008 n=5+5)
ReadWithoutFieldsPerRecord-8    4.59µs ± 0%    2.77µs ± 0%  -39.58%  (p=0.016 n=4+5)
ReadLargeFields-8               57.0µs ± 0%    55.7µs ± 0%   -2.18%  (p=0.008 n=5+5)

name                          old alloc/op   new alloc/op   delta
Read-8                            660B ± 0%      664B ± 0%   +0.61%  (p=0.008 n=5+5)
ReadWithFieldsPerRecord-8         660B ± 0%      664B ± 0%   +0.61%  (p=0.008 n=5+5)
ReadWithoutFieldsPerRecord-8    1.14kB ± 0%    0.66kB ± 0%  -41.75%  (p=0.008 n=5+5)
ReadLargeFields-8               3.86kB ± 0%    3.94kB ± 0%   +1.86%  (p=0.008 n=5+5)

name                          old allocs/op  new allocs/op  delta
Read-8                            30.0 ± 0%      18.0 ± 0%  -40.00%  (p=0.008 n=5+5)
ReadWithFieldsPerRecord-8         30.0 ± 0%      18.0 ± 0%  -40.00%  (p=0.008 n=5+5)
ReadWithoutFieldsPerRecord-8      50.0 ± 0%      18.0 ± 0%  -64.00%  (p=0.008 n=5+5)
ReadLargeFields-8                 66.0 ± 0%      24.0 ± 0%  -63.64%  (p=0.008 n=5+5)

For a simple application that I wrote, which reads in a CSV file (via
ReadAll) and outputs the number of rows read (15857625 rows), this change
reduces the total time on my notebook from ~58 seconds to ~48 seconds.

This reduces time and allocations (bytes) each by ~6% for a real world
CSV file at work (~230000 rows, 13 colums).

Updates #16791

Change-Id: Ia07177c94624e55cdd3064a7d2751fb69322d3e4
Reviewed-on: https://go-review.googlesource.com/24723
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
src/encoding/csv/reader.go