[{"data":1,"prerenderedAt":991},["ShallowReactive",2],{"tech-file-search-knowledge-base-karpathy":3},{"id":4,"title":5,"author":6,"body":7,"category":972,"date":973,"description":974,"extension":975,"image":976,"meta":977,"navigation":117,"path":979,"readingTime":980,"seo":981,"stem":982,"tags":983,"__hash__":990},"tech\u002Ftech\u002Ffile-search-knowledge-base-karpathy.md","Build a Knowledge Base from Legal Documents, the Karpathy Way","Zainul Fanani",{"type":8,"value":9,"toc":960},"minimark",[10,14,22,29,34,41,57,60,64,217,221,224,232,238,242,248,303,306,310,317,627,630,634,637,658,661,665,840,844,900,904,907,933,937,940,947,950,953,956],[11,12,13],"p",{},"Ever been through this? The CEO asks \"Who's the director of our company?\" and you have to dig the answer out of 26 legal documents scattered across Google Drive. Open them one by one, scroll, hunt for a name... and 10 minutes later you finally find it.",[11,15,16,17,21],{},"Now imagine the answer showing up in ",[18,19,20],"strong",{},"100 milliseconds",". No opening files, no scrolling. Just ask and get the answer.",[11,23,24,25,28],{},"That's what we're building today: a ",[18,26,27],{},"File Search Knowledge Base",", a pattern popularized by Andrej Karpathy (ex-OpenAI, founder of Eureka Labs). The idea is simple but powerful.",[30,31,33],"h2",{"id":32},"kenapa-karpathy-style","Why Karpathy Style?",[11,35,36,37,40],{},"Andrej Karpathy has an elegant approach to file search. Instead of throwing an LLM at everything (which is slow and expensive), he splits the work into ",[18,38,39],{},"two paths",":",[42,43,44,51],"ol",{},[45,46,47,50],"li",{},[18,48,49],{},"Regex path",": for structured data (people's names, NPWP tax IDs, deed numbers). Super fast, ~100ms.",[45,52,53,56],{},[18,54,55],{},"LLM path",": for queries that need reasoning (addresses, summaries, legal opinions). Slower but accurate, ~3-5 seconds.",[11,58,59],{},"The result? 
About 90% of queries get answered by regex, and the LLM only gets called for the cases that really need a \"brain\".",[30,61,63],{"id":62},"architecture-nya-gini","Here's the Architecture",[65,66,71],"pre",{"className":67,"code":68,"language":69,"meta":70,"style":70},"language-mermaid shiki shiki-themes github-light github-dark","flowchart TD\n    subgraph Sources\n        A[Cloud Storage\u003Cbr\u002F>Legal Documents]\n        B[Local Cache\u003Cbr\u002F>PDF + Text Files]\n        C[Wiki KB\u003Cbr\u002F>Accumulated Answers]\n    end\n\n    subgraph Processing\n        D[Query → Regex Extract\u003Cbr\u002F>Names, NPWP, Deed Numbers]\n        E[RAG Scoring\u003Cbr\u002F>Metadata + Full Text]\n        F[LLM Answer\u003Cbr\u002F>GPT-4 \u002F Gemini]\n    end\n\n    subgraph Output\n        G[Answer + Sources]\n    end\n\n    A -->|Sync + pdftotext| B\n    B --> D\n    B --> E\n    C --> E\n    D -->|~100ms| G\n    E -->|~3-5s| F\n    F --> G\n","mermaid","",[72,73,74,82,88,94,100,106,112,119,125,131,137,143,148,153,159,165,170,175,181,187,193,199,205,211],"code",{"__ignoreMap":70},[75,76,79],"span",{"class":77,"line":78},"line",1,[75,80,81],{},"flowchart TD\n",[75,83,85],{"class":77,"line":84},2,[75,86,87],{},"    subgraph Sources\n",[75,89,91],{"class":77,"line":90},3,[75,92,93],{},"        A[Cloud Storage\u003Cbr\u002F>Legal Documents]\n",[75,95,97],{"class":77,"line":96},4,[75,98,99],{},"        B[Local Cache\u003Cbr\u002F>PDF + Text Files]\n",[75,101,103],{"class":77,"line":102},5,[75,104,105],{},"        C[Wiki KB\u003Cbr\u002F>Accumulated Answers]\n",[75,107,109],{"class":77,"line":108},6,[75,110,111],{},"    end\n",[75,113,115],{"class":77,"line":114},7,[75,116,118],{"emptyLinePlaceholder":117},true,"\n",[75,120,122],{"class":77,"line":121},8,[75,123,124],{},"    subgraph Processing\n",[75,126,128],{"class":77,"line":127},9,[75,129,130],{},"        D[Query → Regex Extract\u003Cbr\u002F>Names, NPWP, Deed Numbers]\n",[75,132,134],{"class":77,"line":133},10,[75,135,136],{},"        E[RAG 
Scoring\u003Cbr\u002F>Metadata + Full Text]\n",[75,138,140],{"class":77,"line":139},11,[75,141,142],{},"        F[LLM Answer\u003Cbr\u002F>GPT-4 \u002F Gemini]\n",[75,144,146],{"class":77,"line":145},12,[75,147,111],{},[75,149,151],{"class":77,"line":150},13,[75,152,118],{"emptyLinePlaceholder":117},[75,154,156],{"class":77,"line":155},14,[75,157,158],{},"    subgraph Output\n",[75,160,162],{"class":77,"line":161},15,[75,163,164],{},"        G[Answer + Sources]\n",[75,166,168],{"class":77,"line":167},16,[75,169,111],{},[75,171,173],{"class":77,"line":172},17,[75,174,118],{"emptyLinePlaceholder":117},[75,176,178],{"class":77,"line":177},18,[75,179,180],{},"    A -->|Sync + pdftotext| B\n",[75,182,184],{"class":77,"line":183},19,[75,185,186],{},"    B --> D\n",[75,188,190],{"class":77,"line":189},20,[75,191,192],{},"    B --> E\n",[75,194,196],{"class":77,"line":195},21,[75,197,198],{},"    C --> E\n",[75,200,202],{"class":77,"line":201},22,[75,203,204],{},"    D -->|~100ms| G\n",[75,206,208],{"class":77,"line":207},23,[75,209,210],{},"    E -->|~3-5s| F\n",[75,212,214],{"class":77,"line":213},24,[75,215,216],{},"    F --> G\n",[30,218,220],{"id":219},"step-1-struktur-folder","Step 1: Folder Structure",[11,222,223],{},"First, we need a place to store all the documents once they've been extracted to text:",[65,225,230],{"className":226,"code":228,"language":229},[227],"language-text","\u002Fdata\u002Flegal-kb\u002F\n├── index.json          # Metadata for every document\n├── cache\u002F              # Text extracted from the PDFs\n│   ├── ACME_-_Akta_Pendirian.txt\n│   └── ...\n└── wiki\u002F               # Auto-saved Q&A\n    ├── direktur_acme.md\n    └── npwp_semua_perusahaan.md\n","text",[72,231,228],{"__ignoreMap":70},[11,233,234,237],{},[72,235,236],{},"index.json"," holds the document metadata: company code, document name, type, and a link to the original source.",[30,239,241],{"id":240},"step-2-download-extract-text","Step 2: Download & Extract 
Text",[11,243,244,245,40],{},"We download the PDFs from cloud storage, then extract them to text with ",[72,246,247],{},"pdftotext",[65,249,253],{"className":250,"code":251,"language":252,"meta":70,"style":70},"language-bash shiki shiki-themes github-light github-dark","# Download from cloud storage\ncloud-cli download FILE_ID --output \u002Ftmp\u002Fdocument.pdf\n\n# Extract text from the PDF\npdftotext -layout \u002Ftmp\u002Fdocument.pdf \u002Ftmp\u002Fdocument.txt\n","bash",[72,254,255,261,281,285,290],{"__ignoreMap":70},[75,256,257],{"class":77,"line":78},[75,258,260],{"class":259},"sJ8bj","# Download from cloud storage\n",[75,262,263,267,271,274,278],{"class":77,"line":84},[75,264,266],{"class":265},"sScJk","cloud-cli",[75,268,270],{"class":269},"sZZnC"," download",[75,272,273],{"class":269}," FILE_ID",[75,275,277],{"class":276},"sj4cs"," --output",[75,279,280],{"class":269}," \u002Ftmp\u002Fdocument.pdf\n",[75,282,283],{"class":77,"line":90},[75,284,118],{"emptyLinePlaceholder":117},[75,286,287],{"class":77,"line":96},[75,288,289],{"class":259},"# Extract text from the PDF\n",[75,291,292,294,297,300],{"class":77,"line":102},[75,293,247],{"class":265},[75,295,296],{"class":276}," -layout",[75,298,299],{"class":269}," \u002Ftmp\u002Fdocument.pdf",[75,301,302],{"class":269}," \u002Ftmp\u002Fdocument.txt\n",[11,304,305],{},"Schedule this with cron so it syncs automatically every week.",[30,307,309],{"id":308},"step-3-regex-extraction-the-magic-trick","Step 3: Regex Extraction (The Magic Trick)",[11,311,312,313,316],{},"This is the coolest part. For a query like \"who's the director?\", we don't need an LLM. 
Regex is enough, and it's ",[18,314,315],{},"well over 10x faster",".",[65,318,322],{"className":319,"code":320,"language":321,"meta":70,"style":70},"language-typescript shiki shiki-themes github-light github-dark","const NOISE_WORDS = new Set([\n  'DIREKTUR', 'ADMINISTRASI', 'HUKUM', 'NOTARIS', 'PAJAK'\n]);\n\nfunction extractNames(text: string): string[] {\n  const names = new Set\u003Cstring>();\n\n  \u002F\u002F Pattern: \"Ms. Jane Smith,\" then a comma or newline; isRealName() is a helper not shown here\n  const p1 = \u002F(?:Ms\\.|Mr\\.)\\s+([A-Z][A-Za-z.\\s]{2,35}?)(?:,|\\n)\u002Fg;\n  let m;\n  while ((m = p1.exec(text)) !== null) {\n    const clean = m[1].trim();\n    if (clean.length > 2 && isRealName(clean)) names.add(clean);\n  }\n\n  return [...names];\n}\n","typescript",[72,323,324,346,372,377,381,410,433,437,442,505,513,542,567,599,604,608,622],{"__ignoreMap":70},[75,325,326,330,333,336,339,342],{"class":77,"line":78},[75,327,329],{"class":328},"szBVR","const",[75,331,332],{"class":276}," NOISE_WORDS",[75,334,335],{"class":328}," =",[75,337,338],{"class":328}," new",[75,340,341],{"class":265}," Set",[75,343,345],{"class":344},"sVt8B","([\n",[75,347,348,351,354,357,359,362,364,367,369],{"class":77,"line":84},[75,349,350],{"class":269},"  'DIREKTUR'",[75,352,353],{"class":344},", ",[75,355,356],{"class":269},"'ADMINISTRASI'",[75,358,353],{"class":344},[75,360,361],{"class":269},"'HUKUM'",[75,363,353],{"class":344},[75,365,366],{"class":269},"'NOTARIS'",[75,368,353],{"class":344},[75,370,371],{"class":269},"'PAJAK'\n",[75,373,374],{"class":77,"line":90},[75,375,376],{"class":344},"]);\n",[75,378,379],{"class":77,"line":96},[75,380,118],{"emptyLinePlaceholder":117},[75,382,383,386,389,392,395,397,400,403,405,407],{"class":77,"line":102},[75,384,385],{"class":328},"function",[75,387,388],{"class":265}," extractNames",[75,390,391],{"class":344},"(",[75,393,229],{"class":394},"s4XuR",[75,396,40],{"class":328},[75,398,399],{"class":276}," 
string",[75,401,402],{"class":344},")",[75,404,40],{"class":328},[75,406,399],{"class":276},[75,408,409],{"class":344},"[] {\n",[75,411,412,415,418,420,422,424,427,430],{"class":77,"line":108},[75,413,414],{"class":328},"  const",[75,416,417],{"class":276}," names",[75,419,335],{"class":328},[75,421,338],{"class":328},[75,423,341],{"class":265},[75,425,426],{"class":344},"\u003C",[75,428,429],{"class":276},"string",[75,431,432],{"class":344},">();\n",[75,434,435],{"class":77,"line":114},[75,436,118],{"emptyLinePlaceholder":117},[75,438,439],{"class":77,"line":121},[75,440,441],{"class":259},"  \u002F\u002F Pattern: \"Ms. Jane Smith,\" then a comma or newline; isRealName() is a helper not shown here\n",[75,443,444,446,449,451,454,458,462,465,468,470,472,475,478,480,483,486,489,491,494,496,499,502],{"class":77,"line":127},[75,445,414],{"class":328},[75,447,448],{"class":276}," p1",[75,450,335],{"class":328},[75,452,453],{"class":269}," \u002F",[75,455,457],{"class":456},"sA_wV","(?:Ms",[75,459,461],{"class":460},"snhLl","\\.",[75,463,464],{"class":328},"|",[75,466,467],{"class":456},"Mr",[75,469,461],{"class":460},[75,471,402],{"class":456},[75,473,474],{"class":276},"\\s",[75,476,477],{"class":328},"+",[75,479,391],{"class":456},[75,481,482],{"class":276},"[A-Z][A-Za-z.\\s]",[75,484,485],{"class":328},"{2,35}?",[75,487,488],{"class":456},")(?:,",[75,490,464],{"class":328},[75,492,493],{"class":276},"\\n",[75,495,402],{"class":456},[75,497,498],{"class":269},"\u002F",[75,500,501],{"class":328},"g",[75,503,504],{"class":344},";\n",[75,506,507,510],{"class":77,"line":133},[75,508,509],{"class":328},"  let",[75,511,512],{"class":344}," m;\n",[75,514,515,518,521,524,527,530,533,536,539],{"class":77,"line":139},[75,516,517],{"class":328},"  while",[75,519,520],{"class":344}," ((m ",[75,522,523],{"class":328},"=",[75,525,526],{"class":344}," p1.",[75,528,529],{"class":265},"exec",[75,531,532],{"class":344},"(text)) ",[75,534,535],{"class":328},"!==",[75,537,538],{"class":276}," null",[75,540,541],{"class":344},") 
{\n",[75,543,544,547,550,552,555,558,561,564],{"class":77,"line":145},[75,545,546],{"class":328},"    const",[75,548,549],{"class":276}," clean",[75,551,335],{"class":328},[75,553,554],{"class":344}," m[",[75,556,557],{"class":276},"1",[75,559,560],{"class":344},"].",[75,562,563],{"class":265},"trim",[75,565,566],{"class":344},"();\n",[75,568,569,572,575,578,581,584,587,590,593,596],{"class":77,"line":150},[75,570,571],{"class":328},"    if",[75,573,574],{"class":344}," (clean.",[75,576,577],{"class":276},"length",[75,579,580],{"class":328}," >",[75,582,583],{"class":276}," 2",[75,585,586],{"class":328}," &&",[75,588,589],{"class":265}," isRealName",[75,591,592],{"class":344},"(clean)) names.",[75,594,595],{"class":265},"add",[75,597,598],{"class":344},"(clean);\n",[75,600,601],{"class":77,"line":155},[75,602,603],{"class":344},"  }\n",[75,605,606],{"class":77,"line":161},[75,607,118],{"emptyLinePlaceholder":117},[75,609,610,613,616,619],{"class":77,"line":167},[75,611,612],{"class":328},"  return",[75,614,615],{"class":344}," [",[75,617,618],{"class":328},"...",[75,620,621],{"class":344},"names];\n",[75,623,624],{"class":77,"line":172},[75,625,626],{"class":344},"}\n",[11,628,629],{},"Why regex and not an LLM? Because OCR'd PDFs are garbled a lot of the time: null bytes, weird characters, inconsistent formatting. 
Regex is far more robust at handling noise like this.",[30,631,633],{"id":632},"step-4-rag-scoring","Step 4: RAG Scoring",[11,635,636],{},"For more complex queries, we score every document by relevance:",[638,639,640,646,652],"ul",{},[45,641,642,645],{},[18,643,644],{},"Metadata match"," (file name, company code): +5 points per matching word",[45,647,648,651],{},[18,649,650],{},"Full text match"," (document contents): +3 points per matching word",[45,653,654,657],{},[18,655,656],{},"Company code bonus",": +20 points when the query mentions the document's company",[11,659,660],{},"The highest-scoring documents become the context for the LLM.",[30,662,664],{"id":663},"step-5-hybrid-answer-assembly","Step 5: Hybrid Answer Assembly",[65,666,668],{"className":319,"code":667,"language":321,"meta":70,"style":70},"async function answerQuery(query: string, index: KBEntry[]) {\n  \u002F\u002F 1. Try regex first; if it can answer directly, we're done\n  const directAnswer = tryDirectAnswer(query, index);\n  if (directAnswer) return { answer: directAnswer };\n\n  \u002F\u002F 2. Score the documents, take the top 5\n  const scored = scoreAndRank(query, index);\n\n  \u002F\u002F 3. Read the context, send it to the LLM\n  const context = scored.slice(0, 5).map(readText).join('\\n---\\n');\n  return await callLLM(query, context);\n}\n",[72,669,670,703,708,723,737,741,746,760,764,769,823,836],{"__ignoreMap":70},[75,671,672,675,678,681,683,686,688,690,692,695,697,700],{"class":77,"line":78},[75,673,674],{"class":328},"async",[75,676,677],{"class":328}," function",[75,679,680],{"class":265}," answerQuery",[75,682,391],{"class":344},[75,684,685],{"class":394},"query",[75,687,40],{"class":328},[75,689,399],{"class":276},[75,691,353],{"class":344},[75,693,694],{"class":394},"index",[75,696,40],{"class":328},[75,698,699],{"class":265}," KBEntry",[75,701,702],{"class":344},"[]) {\n",[75,704,705],{"class":77,"line":84},[75,706,707],{"class":259},"  \u002F\u002F 1. 
Try regex first; if it can answer directly, we're done\n",[75,709,710,712,715,717,720],{"class":77,"line":90},[75,711,414],{"class":328},[75,713,714],{"class":276}," directAnswer",[75,716,335],{"class":328},[75,718,719],{"class":265}," tryDirectAnswer",[75,721,722],{"class":344},"(query, index);\n",[75,724,725,728,731,734],{"class":77,"line":96},[75,726,727],{"class":328},"  if",[75,729,730],{"class":344}," (directAnswer) ",[75,732,733],{"class":328},"return",[75,735,736],{"class":344}," { answer: directAnswer };\n",[75,738,739],{"class":77,"line":102},[75,740,118],{"emptyLinePlaceholder":117},[75,742,743],{"class":77,"line":108},[75,744,745],{"class":259},"  \u002F\u002F 2. Score the documents, take the top 5\n",[75,747,748,750,753,755,758],{"class":77,"line":114},[75,749,414],{"class":328},[75,751,752],{"class":276}," scored",[75,754,335],{"class":328},[75,756,757],{"class":265}," scoreAndRank",[75,759,722],{"class":344},[75,761,762],{"class":77,"line":121},[75,763,118],{"emptyLinePlaceholder":117},[75,765,766],{"class":77,"line":127},[75,767,768],{"class":259},"  \u002F\u002F 3. 
Read the context, send it to the LLM\n",[75,770,771,773,776,778,781,784,786,789,791,794,797,800,803,806,808,811,813,816,818,820],{"class":77,"line":133},[75,772,414],{"class":328},[75,774,775],{"class":276}," context",[75,777,335],{"class":328},[75,779,780],{"class":344}," scored.",[75,782,783],{"class":265},"slice",[75,785,391],{"class":344},[75,787,788],{"class":276},"0",[75,790,353],{"class":344},[75,792,793],{"class":276},"5",[75,795,796],{"class":344},").",[75,798,799],{"class":265},"map",[75,801,802],{"class":344},"(readText).",[75,804,805],{"class":265},"join",[75,807,391],{"class":344},[75,809,810],{"class":269},"'",[75,812,493],{"class":276},[75,814,815],{"class":269},"---",[75,817,493],{"class":276},[75,819,810],{"class":269},[75,821,822],{"class":344},");\n",[75,824,825,827,830,833],{"class":77,"line":139},[75,826,612],{"class":328},[75,828,829],{"class":328}," await",[75,831,832],{"class":265}," callLLM",[75,834,835],{"class":344},"(query, context);\n",[75,837,838],{"class":77,"line":145},[75,839,626],{"class":344},[30,841,843],{"id":842},"results-nya-cakep","The Results Look Good",[845,846,847,863],"table",{},[848,849,850],"thead",{},[851,852,853,857,860],"tr",{},[854,855,856],"th",{},"Query",[854,858,859],{},"Method",[854,861,862],{},"Speed",[864,865,866,878,889],"tbody",{},[851,867,868,872,875],{},[869,870,871],"td",{},"\"Who is the director of Acme Corp?\"",[869,873,874],{},"Regex",[869,876,877],{},"~150ms",[851,879,880,883,886],{},[869,881,882],{},"\"Beta Inc's office address?\"",[869,884,885],{},"RAG + LLM",[869,887,888],{},"~3s",[851,890,891,894,897],{},[869,892,893],{},"\"NPWP for all companies?\"",[869,895,896],{},"Regex + Wiki",[869,898,899],{},"~200ms",[30,901,903],{"id":902},"tips-dari-pengalaman","Tips from Experience",[11,905,906],{},"A few things I learned while implementing this:",[42,908,909,915,921,927],{},[45,910,911,914],{},[18,912,913],{},"Always use the full text, not just metadata",": addresses and phone numbers usually show up in the document body, not in the file 
name.",[45,916,917,920],{},[18,918,919],{},"Wiki accumulation is a game-changer",": answers to questions that have already been asked get saved, so next time there's no need to reprocess.",[45,922,923,926],{},[18,924,925],{},"Sync weekly with cron",": so the local documents stay up to date with the latest versions in the cloud.",[45,928,929,932],{},[18,930,931],{},"pdftotext -layout"," works better than running without the flag: it preserves the layout, which makes regex matching easier.",[30,934,936],{"id":935},"kesimpulan","Conclusion",[11,938,939],{},"This Karpathy pattern is really elegant: regex for the fast path, an LLM for whatever needs reasoning. Not over-engineered, not under-engineered. Just right.",[11,941,942,943,946],{},"And most importantly, the design is ",[18,944,945],{},"offline-first",". Documents are cached locally, so regex-path queries need no internet and response times stay predictable; only the LLM path needs a connection when the model is a hosted one like GPT-4 or Gemini.",[11,948,949],{},"If you have a set of legal documents you need to search often, give this approach a try. 
It really is a game-changer.",[951,952],"hr",{},[11,954,955],{},"If this tutorial was useful, share it with friends who need it!",[957,958,959],"style",{},"html .default .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .shiki span {color: var(--shiki-default);background: var(--shiki-default-bg);font-style: var(--shiki-default-font-style);font-weight: var(--shiki-default-font-weight);text-decoration: var(--shiki-default-text-decoration);}html .dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html.dark .shiki span {color: var(--shiki-dark);background: var(--shiki-dark-bg);font-style: var(--shiki-dark-font-style);font-weight: var(--shiki-dark-font-weight);text-decoration: var(--shiki-dark-text-decoration);}html 
pre.shiki code .sJ8bj, html code.shiki .sJ8bj{--shiki-default:#6A737D;--shiki-dark:#6A737D}html pre.shiki code .sScJk, html code.shiki .sScJk{--shiki-default:#6F42C1;--shiki-dark:#B392F0}html pre.shiki code .sZZnC, html code.shiki .sZZnC{--shiki-default:#032F62;--shiki-dark:#9ECBFF}html pre.shiki code .sj4cs, html code.shiki .sj4cs{--shiki-default:#005CC5;--shiki-dark:#79B8FF}html pre.shiki code .szBVR, html code.shiki .szBVR{--shiki-default:#D73A49;--shiki-dark:#F97583}html pre.shiki code .sVt8B, html code.shiki .sVt8B{--shiki-default:#24292E;--shiki-dark:#E1E4E8}html pre.shiki code .s4XuR, html code.shiki .s4XuR{--shiki-default:#E36209;--shiki-dark:#FFAB70}html pre.shiki code .sA_wV, html code.shiki .sA_wV{--shiki-default:#032F62;--shiki-dark:#DBEDFF}html pre.shiki code .snhLl, html code.shiki .snhLl{--shiki-default:#22863A;--shiki-default-font-weight:bold;--shiki-dark:#85E89D;--shiki-dark-font-weight:bold}",{"title":70,"searchDepth":84,"depth":84,"links":961},[962,963,964,965,966,967,968,969,970,971],{"id":32,"depth":84,"text":33},{"id":62,"depth":84,"text":63},{"id":219,"depth":84,"text":220},{"id":240,"depth":84,"text":241},{"id":308,"depth":84,"text":309},{"id":632,"depth":84,"text":633},{"id":663,"depth":84,"text":664},{"id":842,"depth":84,"text":843},{"id":902,"depth":84,"text":903},{"id":935,"depth":84,"text":936},"tech","2026-04-09","A complete tutorial on building a file search knowledge base from legal-document PDFs. Regex extraction + RAG scoring + LLM, Andrej Karpathy style.","md","\u002Fimages\u002Fposts\u002Ffile-search-kb-karpathy.jpg",{"slug":978},"file-search-knowledge-base-karpathy","\u002Ftech\u002Ffile-search-knowledge-base-karpathy",null,{"title":5,"description":974},"tech\u002Ffile-search-knowledge-base-karpathy",[984,985,986,987,988,989],"openclaw","knowledge-base","rag","karpathy","ai-assistant","legal","NQloW_Vs6uAnQMt3a-RSBcPqcKbbgCxTqlrBigxSQTA",1775747830281]