Commit Graph

4 Commits

Author SHA1 Message Date
s793016
b0d99f3bd3 Fix Freelinks Aho-Corasick: failure links, cache invalidation, longest-match, and Unicode safety (#9676)
* Update aho-corasick.js

 fix transition logic; ensure complete outputs (via failure-output merge); clean up stats/build scoping; clarify CJK boundary behavior.

* Update text.js

implement global longest-match priority with overlap suppression; fix refresh invalidation to ignore $:/state and drafts; handle deletions precisely to avoid rebuilding on draft deletion; add defensive check for cached automaton presence.

* Update text.js

remove comment

* Update aho-corasick.js

remove comment

* Create #9672.tid

* Create #2026-0222.tid

* Delete editions/tw5.com/tiddlers/releasenotes/5.4.0/#2026-0222.tid

* Update text.js

remove \"

* Update and rename #9672.tid to #9676.tid

change to right number

* Update #9397.tid

update the existing release note with the new PR link instead of creating a new release note.

* Delete editions/tw5.com/tiddlers/releasenotes/5.4.0/#9676.tid

update the existing release note with the new PR link instead of creating a new release note.

* Rename #9397.tid to #9676.tid

update the existing release note with the new PR link instead of creating a new release note.

* Update and rename #9676.tid to #9397.tid

add link

* Rename #9397.tid to #9676.tid

* Update tiddlywiki.info

add plugin for test build

* Update tiddlywiki.info

reverse change, ready to be merge.
2026-02-25 12:07:32 +01:00
Saq Imtiaz
785086e0a5 Fixes ESLint errors (#9668)
* fix: apply automatic eslint fixes

* lint: allow hashbang comment for tiddlywiki.js

* lint: first back of manual lint fixes for unused vars

* lint: added more fixes for unused vars

* lint: missed files

* lint: updated eslint config with selected rules from #9669
2026-02-20 08:38:42 +00:00
s793016
7c197f6ecc Fix character disappearing and false matching issues in TiddlyWiki 5.4 Freelink plugin (#9397)
* Update aho-corasick.js

False positive matches

Symptom: Words like "it is", "Choose", "Set up" are incorrectly linked to tiddler "FooBar" when a tiddler titled "xxx x FooBar" exists.

Root cause: The Aho-Corasick algorithm's output merging mechanism in buildFailureLinks caused failure link outputs to be incorrectly merged into intermediate nodes, resulting in false matches.

Fix:

Remove incorrect output merging in buildFailureLinks
Implement proper output collection during search by traversing the failure link chain
Add exact match validation: verify that the matched text exactly equals the pattern before accepting it
Add cycle detection to prevent infinite loops in failure link traversal

* Update text.js

First character disappearing

Symptom: When freelinking is enabled, the first character of matched words disappears (e.g., "The" becomes "he", "Filter" becomes "ilter").

Root cause: When the current tiddler's title was being filtered out, it was done too late in the process (during parse tree construction), causing text rendering issues.

Fix:

Move the current tiddler title filtering to the match validation stage (in processTextWithMatches)
Use substring instead of slice for better stability
Add proper case-insensitive comparison for title matching

* Update text.js

add back description

* Update aho-corasick.js

add back description

* Update tiddlywiki.info

add freelinks plugin for testing

* Update tiddlywiki.info

restore

* Update tiddlywiki.info

add freelinks plugin for test

* Update aho-corasick.js

erase comment

* Update text.js

erase comment

* Update aho-corasick.js

add back some commets

* Update aho-corasick.js

clean comment

* change note #9397

change note #9397

* Update tiddlywiki.info

reversed to original

* Update #9397.tid

update detail

* Update #9397.tid

another link added

* Update #9397.tid

add "release: 5.4.0"

* Update #9397.tid

some format modified
2025-12-17 15:02:42 +00:00
s793016
cda8d7ca8c Aho-Corasick Freelinks Enhancement for Large Wikis and Non-Latin Titles (#9084)
* Enhance Freelinks with Aho-Corasick for long titles and large wikis

Replaces regex with Aho-Corasick, adds chunking (100 titles/chunk), cache toggle, and Chinese full-width symbol support. Tested with 11,714 tiddlers.

* delete comment

* Create AhoCorasick.js

* Update text.js

* Update text.js

* Update AhoCorasick.js

* Update text.js

* Update text.js

* move AhoCorasick to AhoCorasick.js

* update AhoCorasick.js

* Delete core/modules/utils/AhoCorasick.js

wrong place

* Update text.js

* indentation modify

* remove function {}

* remove function {}

* Rename AhoCorasick.js to aho-corasick.js

correct filename

* Update tiddlywiki.info add freelink

* missing a comma here

* clean up comments & use old style

* try add it to editions/tw5.com-server for testing

* try add it to editions/prerelease for testing

* optimized

* optimized

* add setting for "Persist AhoCorasick cache"

* add  dynamic limits

* remove comment

* revert to 5f0b98d1fd

* try sort alphabet

* try sort alphabet

* try sort alphabet

* typo freelink -> freelinks

* typo freelink -> freelinks

* typo freelink -> freelinks

* Update readme.tid

* Update aho-corasick.js

Dynamically adjust limit parameters to avoid problems caused by hard-coded limits.

* Update text.js

Dynamically adjust limit parameters to avoid problems caused by hard-coded limits.

* Update tiddlywiki.info

remove other plugin for test plugin conflict

* Update tiddlywiki.info

* Update tiddlywiki.info

* Update aho-corasick.js

Description of major changes

Improve state transition logic - Ensure to go back to root node correctly in case of mismatch, and check root node for current character transition 
Fix failed link traversal - Add condition node ! == this.trie to avoid infinite loop at root node 
Enhance output collection - collect output not only from current node, but also from all nodes on failed link path, which is key to Aho-Corasick algorithm 
Add safety limit - collectCount < 10 to prevent failed link loops

Translated with DeepL.com (free version)

* Update aho-corasick.js

Word Boundary Check - The isWordBoundaryMatch function checks if the match is on a word boundary:

Alphanumeric characters [a-zA-Z0-9_] are regarded as unicode characters 
At least one non-unicode character must be present before and after the match for it to be considered valid.

* Update text.js

Word Boundary Check - The isWordBoundaryMatch function checks if the match is on a word boundary:

Alphanumeric characters [a-zA-Z0-9_] are regarded as unicode characters 
At least one non-unicode character must be present before and after the match for it to be considered valid.

* Update settings.tid

Word Boundary Check - The isWordBoundaryMatch function checks if the match is on a word boundary:

Alphanumeric characters [a-zA-Z0-9_] are regarded as unicode characters 
At least one non-unicode character must be present before and after the match for it to be considered valid.

* fix Word Boundary logic

* remove PersistentCache @ text.js

* remove PersistentCache @settings.tid

* Update readme.tid for Word Boundary Check

* Update aho-corasick.js Organize and delete comments

* Initial commit of freelinks plugin

* Update settings.tid Organize and delete comments

* Update tiddlywiki.info add back other plugin

* Update tiddlywiki.info alphabet sort

* Update readme.tid for new future

The plugin supports non-Western language tiddler titles (e.g., Chinese) and prioritizes longer tiddler titles for matching, ensuring accurate linking in diverse contexts. 

Furthermore, the current tiddler title within its own content is excluded from generating links to avoid self-referencing.

* Update readme.tid

* Update plugins/tiddlywiki/freelinks/text.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update plugins/tiddlywiki/freelinks/aho-corasick.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update plugins/tiddlywiki/freelinks/aho-corasick.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update plugins/tiddlywiki/freelinks/text.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update plugins/tiddlywiki/freelinks/aho-corasick.js

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>

* Update text.js

Added locale configuration support - Added LOCALE_CONFIG_TIDDLER constant to make the sorting locale configurable instead of hardcoded "zh"
Optimized title processing - Combined the filtering and escaping logic into a single pass to reduce duplication
Added trim() for ignoreCase - Applied .trim() to the ignore case variable for consistency
Enhanced refresh logic - Added locale configuration tiddler to the refresh check
Improved comments - Added explanation for why sorting is necessary (prioritizing longer titles)

* Update text.js

we don't need to specify 'zh' at all

* Update aho-corasick.js

This single line change would add support for:

Accented letters: á, é, í, ó, ú, à, è, ì, ò, ù, ä, ë, ï, ö, ü, ñ, ç, etc.
Most Western European languages (Spanish, French, German, Italian, Portuguese, etc.)

* Update aho-corasick.js useage

* Update readme.tid for Writing Style

* Update tiddlywiki.info

revert all the changes

* Update tiddlywiki.info

revert all the changes

* Update tiddlywiki.info

revert all the changes

* Update tiddlywiki.info

revert

* Update text.js

plugins/tiddlywiki/freelinks/text.js#L25
[ESLint PR code] reported by reviewdog 🐶
Strings must use doublequote.

* Update aho-corasick.js

plugins/tiddlywiki/freelinks/aho-corasick.js#L193
[ESLint PR code] reported by reviewdog 🐶
Strings must use doublequote.

Raw Output:
{"ruleId":"@stylistic/quotes","severity":2,"message":"Strings must use doublequote.","line":193,"column":50,"nodeType":"Literal","messageId":"wrongQuotes","endLine":193,"endColumn":52,"fix":{"range":[5743,5745],"text":"\"\""}}

---------

Co-authored-by: Mario Pietsch <pmariojo@gmail.com>
2025-10-29 17:41:35 +00:00