LPSA-52793 (Resolved): Search does not provide accurate results and highlights for Japanese content neither on Elasticsearch nor on Solr - DM
Description
Attachments
depends on
Details
Assignee: SE Support
Reporter: Tibor Lipusz
Labels: 7.2-won't, 7.3-known-issues, 7.4-known-issues, ReleaseValveDXPSolr2017SEPee-ts, liferay-ga1-dxp-7413, liferay-ga10-ce-743, liferay-ga11-ce-743, liferay-ga12-ce-743, liferay-ga13-ce-743-known-issue, liferay-ga14-ce-743-known-issues, liferay-ga15-ce-743-known-issues, liferay-ga16-ce-743-known-issues, liferay-ga17-ce-743-known-issues, liferay-ga18-ce-743-known-issues, liferay-ga19-ce-743-known-issues, liferay-ga2-ce-741, liferay-ga20-ce-743-known-issues, liferay-ga21-ce-743-known-issues, liferay-ga22-ce-743-known-issues, liferay-ga23-ce-743-known-issues, liferay-ga24-ce-743-known-issues, liferay-ga25-ce-743-known-issues, liferay-ga26-ce-743-known-issues, liferay-ga27-ce-743-known-issues, liferay-ga4-ce-733, liferay-ga4-ce-743, liferay-ga5-ce-734, liferay-ga5-ce-743, liferay-ga6-ce-735, liferay-ga6-ce-743, liferay-ga7-ce-736, liferay-ga7-ce-743, liferay-ga8-ce-737, liferay-ga8-ce-743, liferay-ga9-ce-743, liferay-u1-dxp-7413, liferay-u2-dxp-7413, lima-board-out, reviewed-by-pm
Epic/Theme: (none listed)
Fix Priority: 4
Development End Date: Jun 01, 2022, 4:46 AM
Components: (none listed)
Affects versions: (none listed)
Priority: Medium
Activity
Yasuyuki Takeo - November 7, 2017 at 5:47 PM (edited)
Is there a way to use separate analyzers/tokenizers for indexing and for searching? That may also affect the results.
Yasu
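On the Elasticsearch side, a field mapping can name one analyzer for indexing and a different one for queries through the `analyzer` and `search_analyzer` mapping parameters. The following is a minimal sketch of such a mapping built as a Python dict. The field name `title_ja` and the analyzer names `ja_index` and `ja_search` are hypothetical placeholders, not Liferay's actual mapping, and `string` is the field type used by Elasticsearch 2.x (later versions use `text`).

```python
import json


def build_field_mapping(index_analyzer: str, search_analyzer: str) -> dict:
    """Return a field mapping that analyzes indexed text and query text differently."""
    return {
        "type": "string",  # Elasticsearch 2.x; later versions use "text"
        "analyzer": index_analyzer,          # applied when documents are indexed
        "search_analyzer": search_analyzer,  # applied to the query string
    }


# Hypothetical field and analyzer names, for illustration only.
mapping = {"properties": {"title_ja": build_field_mapping("ja_index", "ja_search")}}
print(json.dumps(mapping, indent=2, ensure_ascii=False))
```

Solr offers a comparable split: a `fieldType` in schema.xml can declare one `<analyzer type="index">` and one `<analyzer type="query">`.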
Yasuyuki Takeo - November 2, 2017 at 8:02 AM
I just tested cases b, c, and f. These three cases are not bugs; they follow from Kuromoji's design.
I checked the response Elasticsearch returns for each query, and Liferay correctly reflects the data that Elasticsearch returns.
Kuromoji behaves this way because its morphological engine tokenizes words based on certain algorithms and an embedded dictionary.
Developers tuning Japanese search usually have to adjust boosts, add words to the dictionary, or combine multiple index fields to meet the requirements of each project.
In terms of case b,
the string "あいうえお" is roughly like "abcde" in English; it has no meaning as a word. As a result, it is tokenized as "あい" and "うえ", which makes sense in Japanese. If you really want to highlight the whole "あいうえお" as one word, you may want to add "あいうえお" to the dictionary as a noun.
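Kuromoji user dictionaries, as documented for Lucene and the Elasticsearch `analysis-kuromoji` plugin, take one CSV entry per line: the surface form, its space-separated segmentation, the space-separated Katakana readings, and a part-of-speech tag. The following small sketch formats such an entry so that "あいうえお" stays a single custom noun. The helper name is hypothetical, and the reading アイウエオ and the tag カスタム名詞 ("custom noun") are assumptions chosen for illustration.

```python
def user_dictionary_entry(surface, segments, readings, pos_tag="カスタム名詞"):
    """Format one Kuromoji user-dictionary line: surface,segments,readings,POS."""
    if "".join(segments) != surface:
        raise ValueError("segments must concatenate to the surface form")
    if len(segments) != len(readings):
        raise ValueError("each segment needs exactly one reading")
    return ",".join([surface, " ".join(segments), " ".join(readings), pos_tag])


# Keep "あいうえお" as one token instead of letting it be split.
entry = user_dictionary_entry("あいうえお", ["あいうえお"], ["アイウエオ"])
print(entry)  # あいうえお,あいうえお,アイウエオ,カスタム名詞
```

After the dictionary changes, existing content has to be reindexed before the new tokenization shows up in search results.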
In terms of case c,
I looked into analyzer as
Then only あい and お are tokenized. (Kuromoji normalizes Kanji and Hiragana to Katakana, so アイ and あい, and オ and お, are the same.)
but あいう is tokenized as
So it doesn't match. This also makes sense.
In terms of case f,
推進部 is tokenized as follows
Again, the Kanji are converted to Katakana. It looks like 推進 (スイシン) and 部 (ブ) are tokenized as separate words, so it makes sense that 部 alone also hits.
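The highlighting in case f follows from matching on tokens rather than on the original string. The following toy sketch shows that effect with plain Python lists instead of a real analyzer. The query tokens 推進 and 部 come from the comment above; the document token list is a made-up example, and a real highlighter may behave differently (for instance with phrase queries).

```python
def highlighted_positions(doc_tokens, query_tokens):
    """Return the indices of document tokens that equal any query token."""
    wanted = set(query_tokens)
    return [i for i, tok in enumerate(doc_tokens) if tok in wanted]


# 推進部 split into two tokens, as described in the comment.
query_tokens = ["推進", "部"]
# Hypothetical document tokens, for illustration only.
doc_tokens = ["営業", "部", "と", "推進", "部"]

print(highlighted_positions(doc_tokens, query_tokens))  # [1, 3, 4]
```

Index 1 is a 部 that is not part of 推進部, yet it is still marked, which mirrors the stray highlight reported in case f.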
Yasuyuki Takeo - October 31, 2017 at 5:25 PM
and are currently working on manual testing with new mappings according to 's request.
Yasuyuki Takeo - October 11, 2017 at 8:41 AM
I just created a pull request to , https://github.com/BryanEngler/liferay-portal/pull/220.
Please let me know if something is missing.
Jordi Rodó - September 13, 2017 at 11:12 PM
Hey Tibor,
Just playing with ES and Kuromoji suggests that the "incorrect" highlighting is caused by how text gets tokenized under the current mapping and settings:
It looks like morphological analysis is being performed. I did the same in Solr/Kuromoji, and it looks like we get the same results
(see attached aiueo.png, "text" and "baseForm" rows).
This makes me wonder whether we should only check that Liferay displays and highlights what the search engine specifies, and leave accuracy to search engine tuning.
Regards,
Jordi
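One way to run this kind of check is Elasticsearch's `_analyze` API, which returns the tokens an analyzer produces for a piece of text. The following sketch only builds the request body and reads the tokens out of a response; it does not contact a cluster. The analyzer name `kuromoji` assumes the `analysis-kuromoji` plugin is installed, the helper names are hypothetical, and the sample response is hand-written to show the shape rather than captured from a real server.

```python
import json


def analyze_request_body(text, analyzer="kuromoji"):
    """Build the JSON body for a request to the _analyze endpoint."""
    return json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)


def token_strings(response):
    """Extract the token strings from an _analyze response dict."""
    return [t["token"] for t in response.get("tokens", [])]


body = analyze_request_body("推進部")

# Hand-written sample in the shape _analyze returns; offsets are illustrative.
sample = {
    "tokens": [
        {"token": "推進", "start_offset": 0, "end_offset": 2},
        {"token": "部", "start_offset": 2, "end_offset": 3},
    ]
}

print(body)
print(token_strings(sample))
```

Comparing the token list for a query string with the token list for the indexed text shows whether a miss or a partial highlight comes from tokenization or from Liferay's rendering.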
Similar for WCM: See .
Steps to Reproduce - master/7.0.x
Start Liferay
Set Japanese as the portal's default language: Control Panel - Configuration - Instance Settings - Misc - Default Language
Upload
to Documents and Media
Upload
to Documents and Media with the following metadata:
Title: サンプルB
Description: これは東京都品川区で登録したファイルです
Add Language Selector portlet to the default page
Switch to Japanese locale
Searching & Results
a. Searched for the string English Japanese
SEARCH RESULT: PASS - Document3.txt is displayed
HIGHLIGHT: PASS - Both English and Japanese are highlighted as expected in Document3.txt.
b. Searched for the string あいうえお 日本語 (aiueo nihongo)
SEARCH RESULT: PASS - Both Document1.txt and Document2.txt are available in the search results.
HIGHLIGHT: FAIL - Partially working. 日本語 was highlighted as expected, but あいうえお is NOT highlighted as a whole. Strangely, only あい is highlighted.
c. Searching for partial strings such as あいう (aiu)
SEARCH RESULT: FAIL - No search results are returned, even though Document1.txt is expected to match because it contains あいう.
HIGHLIGHT: FAIL - Since there are no search results, nothing can be highlighted.
d. Search for サンプル (sampuru)
SEARCH RESULT: PASS - Document サンプルB is visible.
HIGHLIGHT: PASS - Only the text サンプル is highlighted.
e. Search for 推進 (suishin)
SEARCH RESULT: PASS - Document is visible
HIGHLIGHT: PASS - The string is highlighted in the DM result.
f. Search for 推進部 (suishinbu)
SEARCH RESULT: PASS - Document is visible
HIGHLIGHT: FAIL - The string is highlighted in the DM result, but strangely, the last character of the string (部) is also highlighted.
g. Search for 品川区 (shinagawaku)
SEARCH RESULT: PASS - DM result is found.
HIGHLIGHT: FAIL - The document contains this string in the actual content, but because the description is the only thing that's displayed, no highlights are present.
Reproduced on master@b7df384c4f71832b3afe5f67d65d3a641d1bbee3
Reproduced with Remote Elasticsearch 2.4.x
Reproduced with Solr: https://dev.liferay.com/discover/deployment/-/knowledge_base/7-0/using-solr - Tested with Liferay Solr 5 Search Engine 1.0.0
It also does not work if you change the assigned analyzer to text_ja for the fields content, description, subtitle, and title in schema.xml and then reindex.
Most probably it affects other assets as well.