Commit Graph

24 Commits

Author SHA1 Message Date
86be1bf614 Merge pull request #6 from jimeh/performance-improvements
feat(performance): improve core undenting performance by 20-30x
2021-02-22 22:47:02 +00:00
6a2254e918 feat(performance): improve core undenting performance by around 20-30x
Previously we relied heavily on regexp to filter out and grab all
indentation white space, and then to strip away indentation shared
across all lines. This was reasonably fast. However, I wanted to see if I
could make it faster by manually iterating over the input. Turns out
doing so makes it around 20 times faster.

The code is a lot more complicated, though, so I'll attempt to break it
down. There are three main phases to it:

1. Iterate over every character of the input to locate all
   line-feed (\n) characters, storing their indexes in an integer slice.
2. Iterate over the list of line-feed indexes, and for each line-feed,
   scan forward until a non-whitespace character is found, counting how
   many whitespace characters we encountered directly after the
   line-feed. If the number is lower than our previously lowest number
   of leading whitespace characters, store that as the new lowest
   number.
3. Now that we know the lowest number of leading whitespace characters
   common across every line of the input, we can iterate over the list
   of line-feed indexes again. This time to build the final output, by
   reading all characters from the line-feed index + whitespace number,
   until the next line-feed character, or end of input.
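
The three phases above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the function name `undentSketch` is hypothetical, only spaces and tabs are treated as indentation, and the handling of blank lines (ignored when computing the minimum, truncated if shorter than it) is an assumption.

```go
package main

import "strings"

// undentSketch is a hypothetical illustration of the three phases described
// above. It is not the implementation from this commit.
func undentSketch(s string) string {
	if s == "" {
		return ""
	}

	// Phase 1: collect the index of every line-feed. A virtual line-feed at
	// index -1 stands in for the start of the first line.
	lfs := []int{-1}
	for i := 0; i < len(s); i++ {
		if s[i] == '\n' {
			lfs = append(lfs, i)
		}
	}

	// Phase 2: count the leading spaces/tabs after each line-feed and keep
	// the smallest count. Blank and whitespace-only lines are skipped
	// (assumption), so they cannot force the minimum down to zero.
	minIndent := -1
	for _, lf := range lfs {
		j := lf + 1
		for j < len(s) && (s[j] == ' ' || s[j] == '\t') {
			j++
		}
		if j == len(s) || s[j] == '\n' {
			continue
		}
		if n := j - (lf + 1); minIndent == -1 || n < minIndent {
			minIndent = n
		}
	}
	if minIndent < 0 {
		minIndent = 0
	}

	// Phase 3: copy each line from (line-feed index + 1 + minIndent) up to
	// the next line-feed, or to the end of the input.
	var b strings.Builder
	b.Grow(len(s))
	for i, lf := range lfs {
		end := len(s)
		if i+1 < len(lfs) {
			end = lfs[i+1]
		}
		start := lf + 1 + minIndent
		if start > end {
			start = end // line is shorter than the shared indentation
		}
		if i > 0 {
			b.WriteByte('\n')
		}
		b.WriteString(s[start:end])
	}
	return b.String()
}
```

With this sketch, `undentSketch("    hello\n      world")` returns `"hello\n  world"`: the four shared spaces are removed and the extra two on the second line are kept.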

Overall this approach yields a 15-20x speed improvement over the old
method.

Benchmarks, before:

    goos: darwin
    goarch: amd64
    pkg: github.com/jimeh/undent
    cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
    BenchmarkBytes/empty-8          78280611                15.18 ns/op
    BenchmarkBytes/single-line-8     2361297               515.1 ns/op
    BenchmarkBytes/single-line_indented-8             317440              3618 ns/op
    BenchmarkBytes/multi-line-8                       630370              1920 ns/op
    BenchmarkBytes/multi-line_space_indented-8        156266              7664 ns/op
    BenchmarkBytes/multi-line_space_indented_without_any_leading_line-breaks-8                155672              8168 ns/op
    BenchmarkBytes/multi-line_space_indented_with_leading_line-breaks-8                       144655              8165 ns/op
    BenchmarkBytes/multi-line_tab_indented-8                                                  206425              5462 ns/op
    BenchmarkBytes/multi-line_tab_indented_without_any_leading_line-breaks-8                  223620              5542 ns/op
    BenchmarkBytes/multi-line_tab_indented_with_leading_line-breaks-8                         208132              5857 ns/op
    BenchmarkBytes/multi-line_tab_indented_with_tabs_and_spaces_after_indent-8                199480              5687 ns/op
    BenchmarkBytes/multi-line_space_indented_with_blank_lines-8                               148402              8072 ns/op
    BenchmarkBytes/multi-line_tab_indented_with_blank_lines-8                                 200929              5691 ns/op
    BenchmarkBytes/multi-line_space_indented_with_random_indentation-8                        197412              6515 ns/op
    BenchmarkBytes/multi-line_tab_indented_with_random_indentation-8                          281493              4272 ns/op
    BenchmarkBytes/long_block_of_text-8                                                         9894            115752 ns/op
    BenchmarkString/empty-8                                                                 100000000               12.75 ns/op
    BenchmarkString/single-line-8                                                            2224165               529.0 ns/op
    BenchmarkString/single-line_indented-8                                                    314088              3784 ns/op
    BenchmarkString/multi-line-8                                                              645804              1968 ns/op
    BenchmarkString/multi-line_space_indented-8                                               149310              8103 ns/op
    BenchmarkString/multi-line_space_indented_without_any_leading_line-breaks-8               145390              8496 ns/op
    BenchmarkString/multi-line_space_indented_with_leading_line-breaks-8                      145579              8161 ns/op
    BenchmarkString/multi-line_tab_indented-8                                                 223596              5487 ns/op
    BenchmarkString/multi-line_tab_indented_without_any_leading_line-breaks-8                 214842              5641 ns/op
    BenchmarkString/multi-line_tab_indented_with_leading_line-breaks-8                        209067              5685 ns/op
    BenchmarkString/multi-line_tab_indented_with_tabs_and_spaces_after_indent-8               210307              5584 ns/op
    BenchmarkString/multi-line_space_indented_with_blank_lines-8                              133948              9280 ns/op
    BenchmarkString/multi-line_tab_indented_with_blank_lines-8                                178296              5769 ns/op
    BenchmarkString/multi-line_space_indented_with_random_indentation-8                       206030              6222 ns/op
    BenchmarkString/multi-line_tab_indented_with_random_indentation-8                         236450              4259 ns/op
    BenchmarkString/long_block_of_text-8                                                       10000            113065 ns/op
    PASS
    ok      github.com/jimeh/undent 44.800s

Benchmarks, after:

    goos: darwin
    goarch: amd64
    pkg: github.com/jimeh/undent
    cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
    BenchmarkBytes/empty-8          596493562                2.074 ns/op
    BenchmarkBytes/single-line-8    20044598                60.64 ns/op
    BenchmarkBytes/single-line_indented-8           12449749                84.43 ns/op
    BenchmarkBytes/multi-line-8                      5086376               232.3 ns/op
    BenchmarkBytes/multi-line_space_indented-8       3077774               400.4 ns/op
    BenchmarkBytes/multi-line_space_indented_without_any_leading_line-breaks-8               3011881               386.6 ns/op
    BenchmarkBytes/multi-line_space_indented_with_leading_line-breaks-8                      3034299               402.9 ns/op
    BenchmarkBytes/multi-line_tab_indented-8                                                 4500271               266.2 ns/op
    BenchmarkBytes/multi-line_tab_indented_without_any_leading_line-breaks-8                 4355886               277.5 ns/op
    BenchmarkBytes/multi-line_tab_indented_with_leading_line-breaks-8                        3758012               289.5 ns/op
    BenchmarkBytes/multi-line_tab_indented_with_tabs_and_spaces_after_indent-8               4425787               271.9 ns/op
    BenchmarkBytes/multi-line_space_indented_with_blank_lines-8                              3035809               412.2 ns/op
    BenchmarkBytes/multi-line_tab_indented_with_blank_lines-8                                3771512               334.2 ns/op
    BenchmarkBytes/multi-line_space_indented_with_random_indentation-8                       4461404               275.6 ns/op
    BenchmarkBytes/multi-line_tab_indented_with_random_indentation-8                         6960343               174.6 ns/op
    BenchmarkBytes/long_block_of_text-8                                                       315788              3776 ns/op
    BenchmarkString/empty-8                                                                 338024905                3.761 ns/op
    BenchmarkString/single-line-8                                                           20067831                59.28 ns/op
    BenchmarkString/single-line_indented-8                                                  13826002                88.16 ns/op
    BenchmarkString/multi-line-8                                                             4451938               261.6 ns/op
    BenchmarkString/multi-line_space_indented-8                                              2911797               411.1 ns/op
    BenchmarkString/multi-line_space_indented_without_any_leading_line-breaks-8              2699631               416.5 ns/op
    BenchmarkString/multi-line_space_indented_with_leading_line-breaks-8                     2737174               436.3 ns/op
    BenchmarkString/multi-line_tab_indented-8                                                4208000               304.6 ns/op
    BenchmarkString/multi-line_tab_indented_without_any_leading_line-breaks-8                4029422               295.8 ns/op
    BenchmarkString/multi-line_tab_indented_with_leading_line-breaks-8                       3929960               310.3 ns/op
    BenchmarkString/multi-line_tab_indented_with_tabs_and_spaces_after_indent-8              3978992               292.5 ns/op
    BenchmarkString/multi-line_space_indented_with_blank_lines-8                             2829766               428.5 ns/op
    BenchmarkString/multi-line_tab_indented_with_blank_lines-8                               3788185               304.8 ns/op
    BenchmarkString/multi-line_space_indented_with_random_indentation-8                      4104337               279.4 ns/op
    BenchmarkString/multi-line_tab_indented_with_random_indentation-8                        7092417               177.4 ns/op
    BenchmarkString/long_block_of_text-8                                                      283140              4398 ns/op
    PASS
    ok      github.com/jimeh/undent 47.252s
2021-02-22 22:42:27 +00:00
98946bf286 docs(readme): fix main description 2021-02-21 04:03:48 +00:00
f0855f3a83 chore(git): update "master" branch references to "main" 2021-02-21 03:45:19 +00:00
c3e2bd98b0 Merge pull request #5 from jimeh/add-print-funcs
feat(print) add Print, Printf, Fprint, and Fprintf functions
2021-02-21 00:10:54 +00:00
fe6ba9c1c4 chore(deps): update golangci-lint to v1.37.x 2021-02-20 22:09:49 +00:00
d1c5735041 chore(makefile): minor cleanup and tweaks 2021-02-20 22:09:49 +00:00
5cae4bc420 feat(print) add Print, Printf, Fprint, and Fprintf functions 2021-02-20 22:09:49 +00:00
4ded03bd72 chore(release): 1.0.2 v1.0.2 2020-12-14 14:55:43 +00:00
68a97519d5 Merge pull request #4 from jimeh/fix-bytes-method
fix(bytes): change Bytes function to accept string input but return a byte slice
2020-12-14 14:55:04 +00:00
5dbdbbf341 fix(bytes): change Bytes function to accept string input but return a byte slice
The old method signature was just nonsensical, as you would always be
providing indented values via a string literal. So it makes much more
sense to have all methods accept a string argument, and then return
different types.

This also allows use of a `Bytesf` method.

This is technically a breaking change, but I'm classifying it as a
bugfix because the old method signature was basically useless.
2020-12-14 14:52:32 +00:00
d79e413e8e chore(release): 1.0.1 v1.0.1 2020-12-07 10:48:40 +00:00
cc372da881 Merge pull request #3 from jimeh/remove-leading-line-break-on-undented-values
fix(whitespace): remove leading line-break from input
2020-12-07 10:46:31 +00:00
b2057429a1 fix(whitespace): remove leading line-break from input
This effectively cleans up what I consider syntactical sugar required
due to Go's syntax. For example:

    str := undent.String(`
        hello
        world`,
    )

In the above example I would consider the initial line-break after the
opening back-tick (`) character syntactical sugar, and hence it should be
discarded from the final undented string.

However if the literal string contains more than one initial line-break,
only the first one should be removed, as the rest would intentionally be
part of the input.
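
A minimal sketch of that rule, assuming the line-break is a single line-feed (\n). The helper name `trimFirstLineBreak` is hypothetical and is not taken from the package:

```go
package main

import "strings"

// trimFirstLineBreak is a hypothetical helper illustrating the rule above:
// drop at most one leading line-feed and leave any further ones untouched.
func trimFirstLineBreak(s string) string {
	// strings.TrimPrefix removes the prefix once, if present.
	return strings.TrimPrefix(s, "\n")
}
```

Here `trimFirstLineBreak("\n\nhello")` returns `"\nhello"`, so the second, intentional line-break survives.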
2020-12-07 10:43:26 +00:00
24e64f6c39 docs(readme): fix readme description to match godoc description 2020-11-26 12:42:41 +00:00
30dba69951 docs(readme): add Benchmarks section, update Go Reference link 2020-11-26 12:32:58 +00:00
28d6c7e8de chore(release): 1.0.0 v1.0.0 2020-11-26 04:31:56 +00:00
fa86e7c1cc chore(release): add new-version make target, add release badge to readme 2020-11-26 04:31:21 +00:00
ffa6b79166 Merge pull request #2 from jimeh/update-readme
docs(readme): add badges, Go ref, and license links
2020-11-26 04:13:58 +00:00
7cd7677e9b docs(readme): add badges, Go ref, and license links 2020-11-26 04:12:47 +00:00
4dfb369149 Merge pull request #1 from jimeh/initial-implementation
feat(undent): initial implementation
2020-11-26 04:04:55 +00:00
d481444f94 ci(github): add CI workflow for GitHub Actions 2020-11-26 04:02:22 +00:00
6cdaf8a476 feat(undent): add initial implementation of String, Stringf, and Bytes 2020-11-26 03:02:09 +00:00
7228fc7247 docs(readme): add readme 2020-11-26 00:31:01 +00:00