{"id":34139,"date":"2026-07-04T17:00:00","date_gmt":"2026-07-04T22:00:00","guid":{"rendered":"https:\/\/www.ecoticias.com\/en\/?p=34139"},"modified":"2026-07-04T13:07:41","modified_gmt":"2026-07-04T18:07:41","slug":"could-an-ai-model-read-a-whole-stack-of-documents-in-one-go-without-slowing-to-a-crawl-that-is-the-claim-now-drawing-attention-around-subq-a-new-large-language-model-from-the-miami-startup-subquadra","status":"publish","type":"post","link":"https:\/\/www.ecoticias.com\/en\/could-an-ai-model-read-a-whole-stack-of-documents-in-one-go-without-slowing-to-a-crawl-that-is-the-claim-now-drawing-attention-around-subq-a-new-large-language-model-from-the-miami-startup-subquadra\/34139\/","title":{"rendered":"Could an AI model read a whole stack of documents in one go without slowing to a crawl? That is the claim now drawing attention around SubQ, a new large language model from the Miami startup Subquadratic"},"content":{"rendered":"\n<p>The company says SubQ attacks one of the biggest hidden problems in modern AI, the cost of making models handle very long prompts. Independent benchmark results give the claim more weight, but the model is still not widely available. So, for now, the story is part breakthrough, part waiting game.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The bottleneck behind long AI prompts<\/h2>\n\n\n\n<p>Most large language models work by breaking text into small pieces called tokens. A token can be a word, part of a word, or even punctuation, depending on the model.<\/p>\n\n\n\n<p>The key technology behind many of today\u2019s systems is the transformer, introduced in the 2017 paper &#8220;Attention Is All You Need&#8221; by researchers linked to Google. 
It helped models compare different parts of a sentence, paragraph, or document so they could understand context better.<\/p>\n\n\n\n<p>The trouble starts when the text gets very long. Standard attention compares every token with every other token, so the work grows with the square of the input length: doubling the prompt roughly quadruples the number of comparisons. Picture every student in a packed gym having to compare notes with every other student before answering one question.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What SubQ says it changed<\/h2>\n\n\n\n<p>Subquadratic says SubQ uses a method called <a href=\"https:\/\/subq.ai\/docs\/subq-1-1-small-model-card.pdf\" target=\"_blank\" rel=\"noopener\">sparse attention<\/a>. In simple terms, the model does not compare every token with every other token. It tries to focus only on the relationships that matter most.<\/p>\n\n\n\n<p>That sounds obvious, right? But it has been hard to do without making the model worse. 
Alex Whedon, co-founder and chief technology officer of the startup, has argued that language is too complex for fixed shortcuts, so SubQ chooses important relationships dynamically for each input.<\/p>\n\n\n\n<p>In practical terms, that could matter for jobs that involve huge files. A lawyer might want to search <a href=\"https:\/\/www.ecoticias.com\/en\/millions-of-ai-generated-texts-are-overwhelming-courts-city-councils-and-businesses-and-the-problem-is-not-the-technology-but-the-unmanageable-volume\/28868\/\">many contracts at once<\/a>. A developer might want an AI tool to inspect an entire codebase instead of small slices of it.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Benchmarks raised the stakes<\/h2>\n\n\n\n<p>The outside evaluation reported strong results on long-context retrieval, which means finding a specific fact buried inside a large amount of text. The evaluator said SubQ 1.1 Small Preview returned exact answers every time at one million and two million tokens, and reached 98 percent exact-match accuracy at six million and 12 million tokens.<\/p>\n\n\n\n<p>On LiveCodeBench, a coding benchmark that collects new programming problems over time to reduce the risk that models have already seen the answers, SubQ reached 89.7 percent pass at four attempts across more than 1,000 problems. That means the model got credit when at least one of four tries solved the task.<\/p>\n\n\n\n<p>Subquadratic also says SubQ 1.1 Small uses 64.5 times less compute than dense attention at one million tokens and runs 56 times faster than <a href=\"https:\/\/arxiv.org\/abs\/2307.08691\" target=\"_blank\" rel=\"noopener\">FlashAttention-2<\/a> on a single attention layer. 
Those are large claims, and they are exactly why the AI community is watching closely.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1800\" height=\"1013\" src=\"https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-model-long-context-1.jpg\" alt=\"A visualization representing the efficiency of Subquadratic\u2019s SubQ AI model, showing how it processes vast amounts of data using sparse attention.\" class=\"wp-image-34141\" title=\"\" srcset=\"https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-model-long-context-1.jpg 1800w, https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-model-long-context-1-300x169.jpg 300w, https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-model-long-context-1-768x432.jpg 768w, https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-model-long-context-1-1536x864.jpg 1536w, https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-model-long-context-1-150x84.jpg 150w\" sizes=\"auto, (max-width: 1800px) 100vw, 1800px\" \/><figcaption class=\"wp-element-caption\">Miami startup Subquadratic is drawing significant industry attention with SubQ, an AI model that claims to process massive document stacks without the traditional speed and cost bottlenecks.<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Why this is not settled yet<\/h2>\n\n\n\n<p>Benchmarks are useful, but they are not the same as everyday use. The developers of RULER, a long-context testing suite from Nvidia, note that their benchmark is not comprehensive and should not replace realistic tasks.<\/p>\n\n\n\n<p>There is another catch. SubQ is still in limited access, so most outside developers cannot test it on messy real-world work. 
That means questions remain about reliability, edge cases, and how it behaves when documents are incomplete, duplicated, or full of contradictions.<\/p>\n\n\n\n<p>Subquadratic also says it began with an existing open-weight frontier model, replaced dense attention with its own system, and then continued training on long books, documents, and repository-scale code. That does not erase the achievement, but it makes the exact source of the gains harder to judge.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What it could mean for AI costs<\/h2>\n\n\n\n<p>If the results hold up in wider testing, the biggest impact may be cost. Long prompts are expensive because they demand more computing power, more memory, and more time. At the end of the day, that can show up in cloud bills and, to some extent, in the electric bill behind data centers.<\/p>\n\n\n\n<p>Current AI tools often work around the problem by chopping documents into pieces, searching for likely matches, and feeding only those pieces into a model. 
That can be useful, but it can also miss relationships that sit far apart in a file.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Introducing SubQ - a major breakthrough in LLM intelligence.\" width=\"1200\" height=\"675\" src=\"https:\/\/www.youtube.com\/embed\/FRVuRO5ZIQI?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><figcaption class=\"wp-element-caption\">YouTube: <em>@subquadratic<\/em>.<\/figcaption><\/figure>\n\n\n\n<p>A model that can hold much more context at a lower cost could change how people use AI for research, coding, finance, and legal review. Not overnight. But it could move the field away from clever workarounds and closer to models that read the whole file.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Skepticism is still healthy<\/h2>\n\n\n\n<p>Some skepticism is justified because the launch claim was unusually bold. Dan McAteer, an AI engineer, summed up the reaction online by saying SubQ was either the biggest advance since the transformer or &#8220;the Theranos of AI.&#8221;<\/p>\n\n\n\n<p>That line spread because it captured the mood. AI has seen real breakthroughs, but it has also seen plenty of overpromising. 
When numbers sound this strong, independent access matters.<\/p>\n\n\n\n<p>Jeanine Sinanan-Singh, Appen\u2019s director of generative AI research, said the results were exciting because they appeared to validate the architecture. 
Still, the real test will come when many more users can try SubQ on their own workloads.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1800\" height=\"1013\" src=\"https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-long-context-breakthrough.jpg\" alt=\"A conceptual visualization of a neural network processing millions of data tokens using sparse attention mechanisms.\" class=\"wp-image-34142\" title=\"\" srcset=\"https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-long-context-breakthrough.jpg 1800w, https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-long-context-breakthrough-300x169.jpg 300w, https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-long-context-breakthrough-768x432.jpg 768w, https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-long-context-breakthrough-1536x864.jpg 1536w, https:\/\/www.ecoticias.com\/en\/wp-content\/uploads\/2026\/07\/subquadratic-subq-ai-long-context-breakthrough-150x84.jpg 150w\" sizes=\"auto, (max-width: 1800px) 100vw, 1800px\" \/><figcaption class=\"wp-element-caption\">Subquadratic claims its SubQ model can process 12 million tokens with a fraction of the compute cost required by traditional dense attention models.<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">What happens next<\/h2>\n\n\n\n<p>Subquadratic says SubQ is designed for coding and for searching across very large datasets. The company also says broader model releases are planned later in 2026, after work with select design partners.<\/p>\n\n\n\n<p>Can it still perform when the files are ugly, the code is old, and the question is vague? That is where the story gets interesting, because real work is rarely as clean as a benchmark.<\/p>\n\n\n\n<p>For now, the safest reading is this. 
SubQ has produced evidence that deserves attention, but not enough public testing to close the case.<\/p>\n\n\n\n<p>The main independent benchmark brief has been published by <a href=\"https:\/\/www.appen.com\/research\/subquadratic-preview-model-benchmark-evaluation\" target=\"_blank\" rel=\"noopener\"><em>Appen<\/em><\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The company says SubQ attacks one of the biggest hidden problems in modern AI, the cost of making models handle &#8230; <\/p>\n<p class=\"read-more-container\"><a title=\"Could an AI model read a whole stack of documents in one go without slowing to a crawl? That is the claim now drawing attention around SubQ, a new large language model from the Miami startup Subquadratic\" class=\"read-more button\" href=\"https:\/\/www.ecoticias.com\/en\/could-an-ai-model-read-a-whole-stack-of-documents-in-one-go-without-slowing-to-a-crawl-that-is-the-claim-now-drawing-attention-around-subq-a-new-large-language-model-from-the-miami-startup-subquadra\/34139\/#more-34139\" aria-label=\"Read more about Could an AI model read a whole stack of documents in one go without slowing to a crawl? 
That is the claim now drawing attention around SubQ, a new large language model from the Miami startup Subquadratic\">Read more<\/a><\/p>\n","protected":false},"author":13,"featured_media":34140,"comment_status":"open","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-34139","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology","resize-featured-image"],"_links":{"self":[{"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/posts\/34139","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/comments?post=34139"}],"version-history":[{"count":1,"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/posts\/34139\/revisions"}],"predecessor-version":[{"id":34143,"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/posts\/34139\/revisions\/34143"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/media\/34140"}],"wp:attachment":[{"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/media?parent=34139"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/categories?post=34139"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ecoticias.com\/en\/wp-json\/wp\/v2\/tags?post=34139"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}