
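ES Source Code Analysis: Data Type Conversion

User submission · June 28, 2022, 17:26 · Society

Some colleagues asked me why we are advised not to use the nested structure, and why it is said to perform poorly. So how does ES actually handle structures such as nested? Since I have read through most of the ES source code, I decided to write it up as notes, which is more comfortable than writing comments directly in the code.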

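1 Problem description

ES is more than a clustered version of Lucene; it also supports rich data types such as nested and object. How does it support them?
ES also supports fields such as _id and _version. How are these stored?
It is said that ES stores the parent doc and the nested docs separately. When they are fetched, what relationship links them together?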

2 Type conversion

2.1 Initial code entry point

The entry point is org.elasticsearch.index.shard.IndexShard#prepareIndex


public static Engine.Index prepareIndex(DocumentMapperForType docMapper, SourceToParse source, long seqNo,
                                        long primaryTerm, long version, VersionType versionType,
                                        Engine.Operation.Origin origin, long autoGeneratedIdTimestamp,
                                        boolean isRetry, long ifSeqNo, long ifPrimaryTerm) {
    long startTime = System.nanoTime();
    // Converts structures such as nested; see "2.2 Detailed type conversion code"
    ParsedDocument doc = docMapper.getDocumentMapper().parse(source);
    // Check whether the mapping needs a dynamic update
    if (docMapper.getMapping() != null) {
        doc.addDynamicMappingsUpdate(docMapper.getMapping());
    }
    // Convert _id to uid. This keeps the data uniform and easier to compress; see "Huffman coding".
    Term uid = new Term(IdFieldMapper.NAME, Uid.encodeId(doc.id()));
    return new Engine.Index(uid, doc, seqNo, primaryTerm, version, versionType, origin, startTime,
        autoGeneratedIdTimestamp, isRetry, ifSeqNo, ifPrimaryTerm);
}

2.2 Detailed type conversion code

/**
 * Parses the document internally. If it contains a nested structure, that structure is converted again.
 * @param mapping
 * @param context
 * @param parser
 * @throws IOException
 */
private static void internalParseDocument(Mapping mapping, MetadataFieldMapper[] metadataFieldsMappers,
                                          ParseContext context, XContentParser parser) throws IOException {
    final boolean emptyDoc = isEmptyDoc(mapping, parser);

    // Pre-processing: add the metadata fields (for example _id and _version) to the root document.
    // See "2.3 Pre-processing: support for fields such as _id".
    for (MetadataFieldMapper metadataMapper : metadataFieldsMappers) {
        metadataMapper.preParse(context);
    }

    if (mapping.root.isEnabled() == false) {
        // entire type is disabled
        parser.skipChildren();
    } else if (emptyDoc == false) {
        // Convert object or nested structures. This method is called recursively.
        parseObjectOrNested(context, mapping.root);
    }

    // Add fields such as _version to each non-root document
    for (MetadataFieldMapper metadataMapper : metadataFieldsMappers) {
        metadataMapper.postParse(context);
    }
}

2.3 Pre-processing: support for fields such as _id

Code location: org.elasticsearch.index.mapper.MetadataFieldMapper#preParse. Only the handling of _id is shown below.

/**
 * _id is also added to the doc, as a Lucene field
 * @param context
 */
@Override
public void preParse(ParseContext context) {
    BytesRef id = Uid.encodeId(context.sourceToParse().id());
    context.doc().add(new Field(NAME, id, Defaults.FIELD_TYPE));
}

This shows only one example, _id. Others such as _version, _seq_no and _source are handled in a similar way.

2.4 Converting complex structures such as nested

The way ES converts nested structures is rather interesting.

2.4.1 Overall entry point for type conversion

/**
 * Converts object or nested structures. The call is recursive, which is how nested and object
 * structures of any depth are handled.
 * @param context
 * @param mapper
 * @throws IOException
 */
static void parseObjectOrNested(ParseContext context, ObjectMapper mapper) throws IOException {
    if (mapper.isEnabled() == false) {
        context.parser().skipChildren();
        return;
    }
    XContentParser parser = context.parser();
    XContentParser.Token token = parser.currentToken();
    if (token == XContentParser.Token.VALUE_NULL) {
        // the object is null ("obj1" : null), simply bail
        return;
    }

    String currentFieldName = parser.currentName();
    if (token.isValue()) {
        throw new MapperParsingException("object mapping for [" + mapper.name() + "] tried to parse field ["
            + currentFieldName + "] as object, but found a concrete value");
    }

    ObjectMapper.Nested nested = mapper.nested();
    // For a nested structure, a blank document is created every time. innerParseObject is recursive,
    // so one object or document is turned into several documents.
    if (nested.isNested()) {
        // See "2.4.2 Initial entry point for nested conversion"
        context = nestedContext(context, mapper);
    }

    // if we are at the end of the previous object, advance
    if (token == XContentParser.Token.END_OBJECT) {
        token = parser.nextToken();
    }
    if (token == XContentParser.Token.START_OBJECT) {
        // if we are just starting an OBJECT, advance, this is the object we are parsing, we need the name first
        token = parser.nextToken();
    }

    // Convert the object
    innerParseObject(context, mapper, parser, currentFieldName, token);
    // restore the enable path flag
    if (nested.isNested()) {
        nested(context, nested);
    }
}

2.4.2 Initial entry point for nested conversion

/**
 * Converts a nested structure internally by creating a blank nested document.
 * TODO: a nested doc has the same _id as its parent doc, and Lucene appends every doc it writes.
 * So a GET naturally finds several docs with the same _id, including the nested ones,
 * and they are then converted internally into the result we actually want.
 * @param context
 * @param mapper
 * @return
 */
private static ParseContext nestedContext(ParseContext context, ObjectMapper mapper) {
    // Create the nested context and a blank document, to which the fields and objects
    // of the nested structure are added later
    context = context.createNestedContext(mapper.fullPath());
    ParseContext.Document nestedDoc = context.doc();
    ParseContext.Document parentDoc = nestedDoc.getParent();

    // We need to add the uid or id to this nested Lucene document too,
    // If we do not do this then when a document gets deleted only the root Lucene document gets deleted and
    // not the nested Lucene documents! Besides the fact that we would have zombie Lucene documents, the ordering of
    // documents inside the Lucene index (document blocks) will be incorrect, as nested documents of different root
    // documents are then aligned with other root documents. This will lead to the nested query, sorting, aggregations
    // and inner hits to fail or yield incorrect results.
    IndexableField idField = parentDoc.getField(IdFieldMapper.NAME);
    if (idField != null) {
        // We just need to store the id as indexed field, so that IndexWriter#deleteDocuments(term) can then
        // delete it when the root document is deleted too.
        nestedDoc.add(new Field(IdFieldMapper.NAME, idField.binaryValue(), IdFieldMapper.Defaults.NESTED_FIELD_TYPE));
    } else {
        throw new IllegalStateException("The root document of a nested document should have an _id field");
    }

    // the type of the nested doc starts with __, so we can identify that its a nested one in filters
    // note, we don't prefix it with the type of the doc since it allows us to execute a nested query
    // across types (for example, with similar nested objects)
    nestedDoc.add(new Field(TypeFieldMapper.NAME, mapper.nestedTypePathAsString(), TypeFieldMapper.Defaults.NESTED_FIELD_TYPE));
    return context;
}

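Read the English comments in that code carefully. The key point is that a nested doc keeps the same _id as its parent. An operation such as GET docId can therefore reach all of the documents, and deletion becomes very convenient, because deleting by the _id term removes the root doc and its nested docs together. This is the approach ES has settled on.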
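To make the deletion point concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene or ES code: the class SharedIdDeleteSketch, the Doc type and deleteById are hypothetical names made up for illustration, and a simple list stands in for the index. It only shows why copying the parent's _id onto each nested doc lets one delete-by-term remove the whole group, and what is left behind when the nested docs carry no _id.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; it does not use Lucene or Elasticsearch classes. */
final class SharedIdDeleteSketch {

    /** Hypothetical stand-in for one Lucene document. */
    static final class Doc {
        final String id;      // null models a nested doc written WITHOUT an _id field
        final boolean nested; // true for a nested doc, false for the root doc

        Doc(String id, boolean nested) {
            this.id = id;
            this.nested = nested;
        }
    }

    /** Mimics a delete-by-term: drops every doc whose _id equals the given id. */
    static int deleteById(List<Doc> index, String id) {
        int before = index.size();
        index.removeIf(doc -> id.equals(doc.id));
        return before - index.size();
    }

    public static void main(String[] args) {
        // Nested docs carry the parent's _id: one delete removes the whole group.
        List<Doc> index = new ArrayList<>();
        index.add(new Doc("1", true));
        index.add(new Doc("1", true));
        index.add(new Doc("1", false));
        index.add(new Doc("2", true));
        index.add(new Doc("2", false));
        System.out.println("deleted with shared _id: " + deleteById(index, "1"));
        System.out.println("docs left: " + index.size());

        // Nested docs without an _id: only the root goes away, the rest become zombies.
        List<Doc> noIds = new ArrayList<>();
        noIds.add(new Doc(null, true));
        noIds.add(new Doc(null, true));
        noIds.add(new Doc("1", false));
        System.out.println("deleted without shared _id: " + deleteById(noIds, "1"));
        System.out.println("zombie docs left: " + noIds.size());
    }
}
```

In the first case all three docs of root "1" disappear in one call. In the second case the two nested docs survive, which is the zombie situation the English comment in nestedContext warns about.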

2.4.3 Data processing

The entry point for populating each field is org.elasticsearch.index.mapper.DocumentParser#innerParseObject. It is called recursively, which makes it rather convoluted to follow.
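Since the recursion is hard to follow in the real code, the following is a rough sketch of the idea in plain Java. It is not the DocumentParser implementation: FlattenSketch, parse, walk and the nestedPaths parameter are hypothetical names, and a Map stands in for the parsed JSON source. It assumes that object fields are written into the current document under dotted paths, that each nested object gets its own blank document carrying the parent's _id and a type starting with __, and that nested docs are emitted before the root doc (the usual Lucene block convention of putting the parent last).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only: not the Elasticsearch DocumentParser. */
final class FlattenSketch {

    /**
     * Turns a source map into a block of flat documents.
     * nestedPaths is a hypothetical stand-in for the mapping: the full paths mapped as nested.
     */
    static List<Map<String, Object>> parse(String id, Map<String, Object> source, Set<String> nestedPaths) {
        List<Map<String, Object>> block = new ArrayList<>();
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("_id", id);
        walk("", source, root, id, nestedPaths, block);
        block.add(root); // the root doc closes the block
        return block;
    }

    @SuppressWarnings("unchecked")
    private static void walk(String prefix, Map<String, Object> object, Map<String, Object> doc,
                             String id, Set<String> nestedPaths, List<Map<String, Object>> block) {
        for (Map.Entry<String, Object> entry : object.entrySet()) {
            String path = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                // object: recurse, but keep writing into the same document
                walk(path, (Map<String, Object>) value, doc, id, nestedPaths, block);
            } else if (value instanceof List && nestedPaths.contains(path)) {
                // nested: every element gets its own blank document with the parent's _id
                for (Object item : (List<Object>) value) {
                    Map<String, Object> nestedDoc = new LinkedHashMap<>();
                    nestedDoc.put("_id", id);
                    nestedDoc.put("_type", "__" + path);
                    walk(path, (Map<String, Object>) item, nestedDoc, id, nestedPaths, block);
                    block.add(nestedDoc);
                }
            } else {
                // leaf value: store it under its full dotted path
                doc.put(path, value);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> source = Map.of(
            "user", Map.of("name", "kim"),
            "comments", List.of(Map.of("author", "a"), Map.of("author", "b")));
        for (Map<String, Object> doc : parse("1", source, Set.of("comments"))) {
            System.out.println(doc);
        }
    }
}
```

With this input the block holds three documents: two for the comments entries and the root doc last. The user object does not get a document of its own; its name ends up on the root doc as user.name.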

2.5 Post-processing: setting _version and similar fields

The entry point for the _version handling is org.elasticsearch.index.mapper.VersionFieldMapper#postParse. The implementation is shown below.

@Override
public void postParse(ParseContext context) {
    // In the case of nested docs, let's fill nested docs with version=1 so that Lucene doesn't write a Bitset for documents
    // that don't have the field. This is consistent with the default value for efficiency.
    Field version = context.version();
    assert version != null;
    for (Document doc : context.nonRootDocuments()) {
        // Add a _version field to this doc
        doc.add(version);
    }
}

Only _version is shown here as an example. The other fields are handled similarly.

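3 Summary

ES is more than a clustered version of Lucene; it also supports rich data types such as nested and object. How does it support them? A: ES flattens them. Object fields are written into the same Lucene document as their parent, while each nested object becomes a separate Lucene document.
ES also supports fields such as _id and _version. How are these stored? A: Each metadata field is added to the document as its own Lucene field, through the preParse and postParse methods of the corresponding MetadataFieldMapper.
It is said that ES stores the parent doc and the nested docs separately. When they are fetched, what relationship links them together? A: They are linked through the _id of the root doc, which is copied onto every nested doc.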

4 Other

Next up is the ES write path, where we will look at how ES handles distributed requests and ensures high availability.
