I found a bug where this is not working as desired. It seems to happen when you paste more than 300 characters containing 2 or more line breaks. This is a very common case, for example, when you want a list of items split into one card per item:
:watermelon:Watermelon Ham, The Non-Linear Toolbox, and Your Desktop Pal - The Land of Random (substack.com)
https://stegriff.co.uk/upblog/web-pages-with-personality/ and https://stegriff.co.uk/upblog/baby-griff/
https://tiv.today/2021/06/kinopio
https://design.futureland.tv/vin/futureland-design/82632?fullscreen=1
Or also:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
The current logic seems to always try to split it into only 2 cards.
1 In the beginning God created the heavens and the earth. 2 Now the earth was formless and empty, darkness was over the surface of the deep, and the Spirit of God was hovering over the waters.
3 And God said, “Let there be light,” and there was light. 4 God saw that the light was good, and he separated the light from the darkness. 5 God called the light “day,” and the darkness he called “night.” And there was evening, and there was morning--the first day.
This is two paragraphs. Both paragraphs individually fit into a single card. So I think the algorithm should prioritize keeping paragraphs together if possible. However, the algorithm splits this into 5 sentences, which is OK, but loses the paragraph information.
i would consider it a different interpretation, rather than a bug. Doing the “right”/smart thing here may mean doing the “wrong” thing in another case, so I lean towards predictability in all cases
as a non-subject-matter expert, having 1, 2, 3, etc. as separate cards in this case seems a lot more readable though?
What would be a case where the proposed algorithm would do the “wrong” thing?
Readability is important, but I think retaining information is more important. This is not specific to bible passages, but for any kind of writing: I have two paragraphs, and when I paste those into Kinopio, I want my paragraphs to be preserved because I treat those as units of information. The current algorithm throws that information away.
What I think is predictable is, “Keep paragraphs together as much as possible. If a paragraph is longer than 300 characters, start taking off sentences from the end until it fits.” Admittedly there are edge cases here which are ambiguous and we can talk about.
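To make the proposal concrete, here is a rough Python sketch of "keep paragraphs together, take sentences off the end until it fits". This is not Kinopio's actual code: the function names, the regex-based sentence splitter, and treating every line break as a paragraph boundary are all assumptions for illustration. The 300-character limit is the one mentioned in this thread.

```python
import re

MAX_CARD_LENGTH = 300  # card limit mentioned in this thread


def split_sentences(paragraph: str) -> list[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def split_paragraph(paragraph: str, limit: int = MAX_CARD_LENGTH) -> list[str]:
    """Keep a paragraph whole if it fits; otherwise peel sentences off the end."""
    cards = []
    sentences = split_sentences(paragraph)
    while sentences:
        # Start with every remaining sentence and drop from the end until it fits.
        count = len(sentences)
        while count > 1 and len(" ".join(sentences[:count])) > limit:
            count -= 1
        # A single sentence longer than the limit is kept as-is here;
        # that is one of the ambiguous edge cases.
        cards.append(" ".join(sentences[:count]))
        sentences = sentences[count:]
    return cards


def split_into_cards(text: str, limit: int = MAX_CARD_LENGTH) -> list[str]:
    """Split on line breaks first, then only split paragraphs that are too long."""
    cards = []
    for paragraph in text.splitlines():
        if paragraph.strip():
            cards.extend(split_paragraph(paragraph.strip(), limit))
    return cards
```

With this sketch, two pasted paragraphs that each fit under the limit come out as two cards, and a list with one item per line comes out as one card per item.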
Also, what would make the current algorithm more palatable would be a way to combine cards, making it easier to put paragraphs back together. So, paint some cards, and if they all fit on one card, add a button to combine/join them together.
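The check behind such a combine/join button could be as small as the sketch below. The names `can_combine` and `combine_cards`, and joining with a single space, are assumptions for illustration, not anything from Kinopio itself.

```python
from typing import Optional

MAX_CARD_LENGTH = 300  # card limit mentioned in this thread


def can_combine(cards: list[str], limit: int = MAX_CARD_LENGTH, separator: str = " ") -> bool:
    """True if the selected cards, joined together, still fit on one card."""
    return len(separator.join(cards)) <= limit


def combine_cards(cards: list[str], limit: int = MAX_CARD_LENGTH, separator: str = " ") -> Optional[str]:
    """Return the combined card text, or None if it would not fit."""
    if not can_combine(cards, limit, separator):
        return None
    return separator.join(cards)
```

The button would only be shown when `can_combine` is true for the painted cards.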
But I still think doing the smart/right thing is preferable. What is your idea of the “right” thing?
I think my proposed algorithm would handle this correctly because it would first split based on new lines, which I assume separate each bullet item. Then you wouldn’t need special logic to detect a numbered list either.
i have a fix for this that i’ll ship near the end of the week (when i’m back from break)
it splits by paragraph, then splits a paragraph by sentences if the paragraph is too big, then splits a sentence into 150-char chunks if the sentence is too big.
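Read literally, that three-step order could look something like the Python sketch below. This is only a guess at the shape of the fix, not the shipped code: the names, the regex sentence splitter, and using the 300-character card limit as the meaning of "too big" are assumptions. Only the 150-character chunk size comes from the post above.

```python
import re

MAX_CARD_LENGTH = 300   # assumed threshold for "too big" (card limit from this thread)
MAX_CHUNK_LENGTH = 150  # chunk size for oversized sentences, from the post above


def split_by_length(sentence: str, size: int = MAX_CHUNK_LENGTH) -> list[str]:
    """Cut a string into fixed-size pieces, ignoring word boundaries."""
    return [sentence[i:i + size] for i in range(0, len(sentence), size)]


def split_text(text: str, limit: int = MAX_CARD_LENGTH, chunk: int = MAX_CHUNK_LENGTH) -> list[str]:
    cards = []
    for paragraph in text.splitlines():       # step 1: split by paragraph
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if len(paragraph) <= limit:
            cards.append(paragraph)            # paragraph fits, keep it whole
            continue
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):  # step 2: by sentence
            if not sentence:
                continue
            if len(sentence) <= limit:
                cards.append(sentence)
            else:
                cards.extend(split_by_length(sentence, chunk))  # step 3: by length
    return cards
```

Note that in this reading an oversized paragraph ends up as one card per sentence; whether sentences get regrouped afterwards is not stated in the post.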
Here is a test case that currently fails (in my opinion):
Elegent pay duty spectacular price treat also price messy. Industry go space juicy, clean mountain the fast handling crystals. Zesty proven advertising and, aroma, rich.
The, grab easily challenge full affordable burst absorbent, terrific bigger any our buy improved sleek. Boast inside however, makes gentle double. Special, screamin' you advertising any extravaganza high.
Current behavior: option to split into 6 cards.
Desired behavior: option to split into 2 cards
Rationale: This is two paragraphs of text. Paragraphs are an atomic unit of thought. We shouldn’t break them up unless they are longer than the max characters. Doing so loses information. If a paragraph is too long, then the behavior is more ambiguous, and any of these seem okay:
break it apart into sentences
break it apart into sentences, but from the last one, keeping as many sentences together as possible.
if it’s only one sentence, break it up at the nearest word boundary.
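For the last option, a single over-long sentence could be broken at word boundaries along the lines of this Python sketch. The function name and the choice to leave a word longer than the limit intact are assumptions for illustration.

```python
def split_at_word_boundary(sentence: str, limit: int = 300) -> list[str]:
    """Greedily pack whole words into cards of at most `limit` characters."""
    cards = []
    current = ""
    for word in sentence.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit or not current:
            # Either the word fits, or it starts a new card
            # (a single word longer than the limit is left intact).
            current = candidate
        else:
            cards.append(current)
            current = word
    if current:
        cards.append(current)
    return cards
```

Python's standard library `textwrap.wrap` with `break_long_words=False` does much the same job, if a hand-rolled loop is not wanted.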