extr 5 hours ago

What is with the negativity in these comments? This is a huge, huge surface area that touches a large percentage of white collar work. Even just basic automation/scaffolding of spreadsheets would be a big productivity boost for many employees.

My wife works in insurance operations - everyone she manages from the top down lives in Excel. For line employees a large percentage of their job is something like "Look at this internal system, export the data to excel, combine it with some other internal system, do some basic interpretation, verify it, make a recommendation". Computer Use + Excel Use isn't there yet...but these jobs are going to be the first on the chopping block as these integrations mature. No offense to these people but Sonnet 4.5 is already at the level where it would be able to replicate or beat the level of analysis they typically provide.

  • Scubabear68 an hour ago

    Having wrangled many spreadsheets personally, and worked with CFOs who use them to run small-ish businesses, and all the way up to one of the top 3 brokerage houses world-wide using them to model complex fixed income instruments... this is a disaster waiting to happen.

    Spreadsheet UI is already a nightmare. The formula editing and relationship visualization is not there at all. Mistakes are rampant in spreadsheets, even my own carefully curated ones.

    Claude is not going to improve this. It is going to make it far, far worse with subtle and not so subtle hallucinations happening left and right.

    The key is really this - all LLMs that I know of rely on entropy and randomness to emulate human creativity. This works pretty well for pretty pictures and creating fan fiction or emulating someone's voice.

    It is not a basis for getting correct spreadsheets that show what you want to show. I don't want my spreadsheet correctness to start from a random seed. I want it to spring from first principles.

    • noosphr 38 minutes ago

      My first job out of uni was building a spreadsheet infra as code version control system after a Windows update made an eight year old spreadsheet go haywire and lose $10m in an afternoon.

      Spreadsheets are already a disaster.

    • MattGaiser 8 minutes ago

      > Mistakes are rampant in spreadsheets

      To me, the case for LLMs is strongest not because LLMs are so unusually accurate and awesome, but because if human performance were put on trial in aggregate, it would be found wanting.

      Humans already do a mediocre job of spreadsheets, so I don't think it is a given that Claude will make more mistakes than humans do.

  • atleastoptimal 10 minutes ago

    HN has a strong base of anti-AI bias, which I assume is partially motivated by insecurity over being replaced, losing their jobs, or having missed the boat on AI.

    • extr 5 minutes ago

      Based on the comments here, it's surprising anything in society works at all. I didn't realize the bar was "everything perfect every time, perfectly flexible and adaptable". What a joy some of these folks must be to work with, answering every new technology with endless reasons why it's worthless and will never work.

    • MattGaiser a minute ago

      HN has an obsession with quality too, which has merit, but is often economically irrelevant.

      When US-East-1 failed, lots of people talked about how the lesson was cloud agnosticism and multi cloud architecture. The practical economic lesson for most is that if US-East-1 fails, nobody will get mad at you. Cloud failure is viewed as an act of god.

  • cube00 5 hours ago

    I don't trust LLMs to do the kind of precise deterministic work you need in a spreadsheet.

    It's one thing to fudge the language in a report summary, it can be subjective, however numbers are not subjective. It's widely known LLMs are terrible at even basic maths.

    Even Google's own AI summary admits it which I was surprised at, marketing won't be happy.

    Yes, it is true that LLMs are often bad at math because they don't "understand" it as a logical system but rather process it as text, relying on pattern recognition from their training data.

    • extr 4 hours ago

      Seems like you're very confused about what this work typically entails. The job of these employees is not mental arithmetic. It's closer to:

      - Log in to the internal system that handles customer policies

      - Find all policies that were bound in the last 30 days

      - Log in to the internal system that manages customer payments

      - Verify that for all policies bound, there exists a corresponding payment that roughly matches the premium.

      - Flag any divergences above X% for accounting/finance to follow up on.

      Practically this involves munging a few CSVs, maybe typing in a few things, setting up some XLOOKUPs, IF formulas, conditional formatting, etc.
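
      A minimal sketch of the steps above in Python, using only the standard library. The column names (`policy_id`, `premium`, `amount`), the sample rows, and the 5% threshold standing in for "X%" are all hypothetical. It assumes the policy export is already filtered to the last 30 days and that every premium is greater than zero.

```python
import csv
import io

# Hypothetical exports; real systems would have their own column names.
POLICIES_CSV = """policy_id,bound_date,premium
P-1,2025-10-01,1200.00
P-2,2025-10-03,850.00
P-3,2025-10-07,400.00
"""

PAYMENTS_CSV = """policy_id,amount
P-1,1200.00
P-2,790.00
"""


def reconcile(policies_csv, payments_csv, tolerance_pct):
    """Return (policy_id, reason) pairs for accounting/finance to follow up on."""
    # Sum payments per policy, since one policy may have several payments.
    paid = {}
    for row in csv.DictReader(io.StringIO(payments_csv)):
        paid[row["policy_id"]] = paid.get(row["policy_id"], 0.0) + float(row["amount"])

    flags = []
    for row in csv.DictReader(io.StringIO(policies_csv)):
        pid, premium = row["policy_id"], float(row["premium"])
        if pid not in paid:
            flags.append((pid, "no payment found"))
            continue
        # Divergence as a percentage of the premium (assumes premium > 0).
        diff_pct = abs(paid[pid] - premium) / premium * 100
        if diff_pct > tolerance_pct:
            flags.append((pid, f"payment differs from premium by {diff_pct:.1f}%"))
    return flags


flags = reconcile(POLICIES_CSV, PAYMENTS_CSV, tolerance_pct=5.0)
for pid, reason in flags:
    print(pid, reason)
```

      The matching and the percentage check are deterministic here; the part that still needs a human is deciding what to do with each flagged row.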

      Will AI replace the entire job? No...but that's not the goal. Does it have to be perfect? Also no...the existing employees performing this work are also not perfect, and in fact sometimes their accuracy is quite poor.

      • AvAn12 an hour ago

        > “Does it have to be perfect?”

        Actually, yes. This kind of management reporting is either (1) going to end up in the books and records of the company - big trouble if things have to be restated in the future or (2) support important decisions by leadership — who will be very much less than happy if analysis turns out to have been wrong.

        A lot of what ties up the time of business analysts is ticking and tying everything to ensure that mistakes are not made and that analytics and interpretations are consistent from one period to the next. The math and queries are simple - the details and correctness are hard.

        • extr 9 minutes ago

          Speak for yourself and your own use cases. There is a huge diversity of workflows to which automation can be applied in any medium to large business. They all have differing needs. Many Excel workflows I'm personally familiar with already incorporate a "human review" step. Telling a business leader that they can now jump straight to that step, even if it requires 2x human review, with AI doing all of the most tedious and low-stakes prework, is a clear win.

        • 2b3a51 an hour ago

          There is another aspect to this kind of activity.

          Sometimes there can be an advantage in leading or lagging some aspects of internal accounting data for a time period. Basically sitting on credits or debits to some accounts for a period of weeks. The tacit knowledge to know when to sit on a transaction and when to action it is generally not written down in formal terms.

          I'm not sure how these shenanigans will translate into an ai driven system.

          • AvAn12 26 minutes ago

            That’s the kind of thing that can get a company into a lot of trouble with its auditors and shareholders. Not that I am offering accounting advice of course. And yeah, one can not “blame” an AI system or try to AI-wash any dodgy practices.

      • Ntrails 3 hours ago

        Checking someone else's spreadsheet is a fucking nightmare. If your company has extremely good standards it's less miserable because at least the formatting etc will be consistent...

        The one thing LLMs should consistently do is ensure that formatting is correct. Which will help greatly in the checking process. But no, I generally don't trust them to do sensible things with basic formulation. Not a week ago GPT 5 got confused whether a plus or a minus was necessary in a basic question of "I'm 323 days old, when is my birthday?"
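
        For reference, the plus-or-minus question can be worked out with Python's `datetime` module. This is only a sketch: the "today" value is a hypothetical date picked for illustration, and the Feb 29 fallback to March 1 is one convention among several.

```python
from datetime import date, timedelta


def birth_date(today, age_in_days):
    # Being N days old means the birth date is N days *before* today: subtract.
    return today - timedelta(days=age_in_days)


def next_birthday(today, age_in_days):
    born = birth_date(today, age_in_days)
    # Try the anniversary in this calendar year, then in the next one.
    for year in (today.year, today.year + 1):
        try:
            candidate = born.replace(year=year)
        except ValueError:
            # Born on Feb 29 and the target year is not a leap year.
            candidate = date(year, 3, 1)
        if candidate > born and candidate >= today:
            return candidate


today = date(2025, 10, 28)  # hypothetical "today"
print(birth_date(today, 323))
print(next_birthday(today, 323))
```

        The sign matters only in the first step; everything after that is ordinary calendar arithmetic.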

        • xmprt 3 hours ago

          I think you have a misunderstanding of the types of things that LLMs are good at. Yes you're 100% right that they can't do math. Yet they're quite proficient at basic coding. Most Excel work is similar to basic coding so I think this is an area where they might actually be pretty well suited.

          My concern would be more with how to check the work (ie, make sure that the formulas are correct and no columns are missed) because Excel hides all that. Unlike code, there's no easy way to generate the diff of a spreadsheet or rely on Git history. But that's different from the concerns that you have.
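
          A rough sketch of what a formula-level diff could look like, assuming each version of a sheet has already been extracted into a plain mapping from cell address to formula text (how that extraction happens is left out, and the sample cells are made up):

```python
def diff_cells(before, after):
    """Compare two {cell_address: formula_or_value} mappings.

    Returns (kind, cell, old, new) tuples, sorted by cell address as plain
    strings (so "B10" sorts before "B2").
    """
    changes = []
    for cell in sorted(before.keys() | after.keys()):
        old, new = before.get(cell), after.get(cell)
        if old == new:
            continue
        if old is None:
            changes.append(("added", cell, None, new))
        elif new is None:
            changes.append(("removed", cell, old, None))
        else:
            changes.append(("changed", cell, old, new))
    return changes


# Hypothetical before/after snapshots of one sheet.
before = {"A1": "100", "A2": "200", "A3": "=SUM(A1:A2)"}
after = {"A1": "100", "A2": "200", "A3": "=SUM(A1:A1)", "B1": "=A3*2"}

for kind, cell, old, new in diff_cells(before, after):
    print(f"{kind:8} {cell}: {old!r} -> {new!r}")
```

          Comparing formula text rather than displayed values is the point: the shrunken `SUM` range in `A3` shows up as a change even in cases where the rendered number happens to stay the same.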

          • Wowfunhappy 17 minutes ago

            > Yes you're 100% right that they can't do math.

            The model ought to be calling out to some sort of tool to do the math—effectively writing code, which it can do. I'm surprised the major LLM frontends aren't always doing this by now.

          • mapt 15 minutes ago

            So do it in basic code where numbering your line G53 instead of G$53 doesn't crash a mass transit network because somebody's algorithm forgot to order enough fuel this month.
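
            To make the G53 vs G$53 difference concrete, here is a simplified model of what filling a formula down does to its references. The regex is an assumption-laden shortcut: it handles plain A1-style references only, and would misread sheet names, ranges with special syntax, or function names that end in digits such as LOG10.

```python
import re

# Optional "$", column letters, optional "$", row digits.
REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def fill_down(formula, rows):
    """Shift relative row references by `rows`; leave "$"-anchored rows alone."""
    def shift(match):
        col_abs, col, row_abs, row = match.groups()
        new_row = row if row_abs else str(int(row) + rows)
        return f"{col_abs}{col}{row_abs}{new_row}"

    return REF.sub(shift, formula)


print(fill_down("=F2*G53", 1))   # relative: the G53 reference drifts
print(fill_down("=F2*G$53", 1))  # anchored: the row stays pinned to 53
```

            One missing character decides whether every copied row points at the same constant or at a different, possibly empty, cell.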

          • collingreen 3 hours ago

            I've built spreadsheet diff tools on Google Sheets multiple times. As the need grows I think we will see diffs and commits and review tools reach customers.

            • break_the_bank 2 hours ago

              hey Collin! I am working on an AI agent on Google Sheets, and I am curious if any of your designs are out in the public. We are trying to re-think what diffs should look like and want to make something nicer than what we currently have, so curious.

          • alfalfasprout 30 minutes ago

            proficient != near-flawless.

            > Most Excel work is similar to basic coding so I think this is an area where they might actually be pretty well suited.

            This is a hot take. One I'm not sure many would agree with.

        • koliber 2 hours ago

          Maybe LLMs will enable a new type of work in spreadsheets. Just like in coding we have PR reviews, with an LLM it should be possible to do a spreadsheet review. Ask the LLM to try to understand the intent and point out places where the spreadsheet deviates from the intent. Also ask the LLM to narrate the spreadsheet so it can be understood.

          • Insanity 2 hours ago

            That first condition "try to understand the intent" is where it could go wrong. Maybe it thinks the spreadsheet aligns with the intent, but it misunderstood the intent.

            LLMs are a lossy validation, and while they work sometimes, when they fail they usually do so 'silently'.

            • monkeydust 30 minutes ago

              Maybe we need some kind of method or framework to develop intent. Most of the things that go wrong in knowledge work are down to a lack of common understanding of intent.

        • runarberg 2 hours ago

          > The one thing LLMs should consistently do is ensure that formatting is correct.

          In JavaScript (and I assume most other programming languages) this is the job of static analysis tools (like eslint, prettier, typescript, etc.). I’m not aware of any LLM-based tools which perform static analysis with results as good as the traditional tools. Is static analysis not a thing in the spreadsheet world? Are the tools which do static analysis on spreadsheets subpar, or do they have some disadvantage not seen in other programming languages? And if so, are LLMs any better?

          • eric-burel an hour ago

            Just use a normal static analysis tool and shove the result to an LLM. I believe Anthropic properly figured that agents are the key, in addition to models, contrary to OpenAI that is run by a psycho that only believes in training the bigger model.

      • dpoloncsak 3 hours ago

        Sysadmin of a small company. I get asked pretty often to help with a pivot table, vlookup, or just general excel functions (and smartsheet, these users LOVE smartsheet)

        • toomuchtodo 2 hours ago

          Indeed, in a small enough org, the sysadmin/technologist becomes support of last resort for all the things.

        • JumpCrisscross 2 hours ago

          > these users LOVE smartsheet

          I hate smartsheet…

          Excel or R. (Or more often, regex followed by pen and paper followed by more regex.)

      • lossolo 3 hours ago

        Last time, I gave claude an invoice and asked it to change one item on it, it did so nicely and gave me the new invoice. Good thing I noticed it had also changed the bank account number..

        The more complicated the spreadsheet and the more dependencies it has, the greater the room for error. These are probabilistic machines. You can use them, I use them all the time for different things, but you need to treat them like employees you can't even trust to copy a bank account number correctly.

        • mikeyouse 3 hours ago

          We’ve tried to gently use them to automate some of our report generation and PDF->Invoice workflows and it’s a nightmare of silent changes and absence of logic.. basic things like specifically telling it “debits need to match credits” and “balance sheets need to balance” that are ignored.

        • wholinator2 an hour ago

          Yeah, asking an LLM to edit one specific thing in a large or complex document/codebase is like those repeated "give me the exact same image" gifs. It's fundamentally a statistical model so the only thing we can be _certain_ of is that _it's not_. It might get the desired change 100% correct but it's only gonna get the entire document 99.5%

      • next_xibalba an hour ago

        The use cases for spreadsheets are much more diverse than that. In my experience, spreadsheets are just as often used for calculation. Many of them do require high accuracy, rely on determinism, and necessitate the understanding of maths ranging from basic arithmetic to statistics and engineering formulas. Financial models, for example, must be built up from ground truth and need to always use the right formulas with the right inputs to generate meaningful outputs.

        I have personally worked with spreadsheet based financial models that use 100k+ rows x dozens of columns and involve 1000s of formulas that transform those data into the desired outputs. There was very little tolerance for mistakes.

        That said, humans, working in these use cases, make mistakes >0% of the time. The question I often have with the incorporation of AI into human workflows is, will we eventually come to accept a certain level of error from them in the way we do for humans?

    • brookst 5 minutes ago

      Do you trust humans to be precise and deterministic, or even to be especially good at math?

      This is talking about applying LLMs to formula creation and references, which they are actually pretty good at. Definitely not about replacing the spreadsheet's calculation engine.

    • Kiro 24 minutes ago

      Most real-world spreadsheets I've worked with were fragile and sloppy, not precise and deterministic. Programmers always get shocked when they realize how many important things are built on extremely messy spreadsheets, and that people simply accept it. They would rather just spend human hours correcting discrepancies than try to build something maintainable.

    • MangoCoffee 8 minutes ago

      LLMs are just a tool, though. Humans still have to verify them, like with every other tool out there.

      • A4ET8a8uTh0_v2 6 minutes ago

        Eh, yes. In theory. In practice, and this is what I have experienced personally, bosses seem to think that you now have interns so you should be able to do 5x the output.. guess what that means. No verification or rubber stamp.

    • laweijfmvo 4 hours ago

      I don't trust humans to do the kind of precise deterministic work you need in a spreadsheet!

      • baconbrand 4 hours ago

        Right, we shouldn’t use humans or LLMs. We should use regular deterministic computer programs.

        For cases where that is not available, we should use a human and never an LLM.

        • davidpolberger an hour ago

          I like to use Claude Code to write deterministic computer programs for me, which then perform the actual work. It saves a lot of time.

          I had a big backlog of "nice to have scripts" I wanted to write for years, but couldn't find the time and energy for. A couple of months after I started using Claude Code, most of them exist.

          • baconbrand 41 minutes ago

            That’s great and the only legitimate use case here. I suspect Microsoft will not try to limit customers to just writing scripts and will instead allow and perhaps even encourage them to let the AI go ham on a bunch of raw data with no intermediary code that could be reviewed.

            Just a suspicion.

        • extr 3 hours ago

          "regular deterministic computer programs" - otherwise known as the SUM function in Microsoft Excel

    • game_the0ry an hour ago

      > I don't trust LLMs to do the kind of precise deterministic work you need in a spreadsheet.

      I was thinking along the same lines, but I could not articulate as well as you did.

      Spreadsheet work is deterministic; LLM output is probabilistic. The two should be distinguished.

      Still, it's a productivity boost, which is always good.

    • mbreese an hour ago

      I don’t see the issue so much as the deterministic precision of an LLM, but the lack of observability of spreadsheets. Just looking at two different spreadsheets, it’s impossible to see what changes were made. It’s not like programming where you can run a `git diff` to see what changes an LLM agent made to a source code file. Or even a word processing document where the text changes are clear.

      Spreadsheets work because the user sees the results of complex interconnected values and calculations. For the user, that complexity is hidden away and left in the background. The user just sees the results.

      This would be a nightmare for most users to validate what changes an LLM made to a spreadsheet. There could be fundamental changes to a formula that could easily be hidden.

      For me, that's the concern with spreadsheets and LLMs - which is just as much a concern with spreadsheets themselves. Try collaborating with someone on a spreadsheet for modeling and you’ll know how frustrating it can be to try and figure out what changes were made.

    • bg24 3 hours ago

      "I don't trust LLMs to do the kind of precise deterministic work" => I think the LLM is not doing the precise arithmetic. It is the agent with lots of knowledge (skills) and tools. Precise deterministic work is done by tools (deterministic code). Skills bring domain knowledge and how to sequence a task. The agent executes it. The LLM predicts the next token.

    • doug_durham 4 hours ago

      Sure, but this isn't requiring that the LLM do any math. The LLM is writing formulas and code to do the math. They are very good at that. And like any automated system you need to review the work.

      • causal 3 hours ago

        Exactly, and if it can be done in a way that helps users better understand their own spreadsheets (which are often extremely complex codebases in a single file!) then this could be a huge use case for Claude.

    • sdeframond an hour ago

      > I don't trust LLMs to do the kind of precise deterministic work you need in a spreadsheet.

      Rightly so! But LLMs can still make you faster. Just don't expect too much from it.

    • chpatrick 2 hours ago

      They're not great at arithmetic but at abstract mathematics and numerical coding they're pretty good actually.

    • mhh__ 2 hours ago

      If LLMs can replace mathematica for me when I'm doing affine yield curve calculations they can do a DCF for some banker idiots

    • informal007 an hour ago

      you might trust it when the precision is extremely high and others agree with that.

      high precision is possible because they can achieve it through multiple cross-validations

    • prisonguard an hour ago

      ChatGPT is actively being used as a calculator.

    • zarmin 3 hours ago

      >I don't trust LLMs to do the kind of precise deterministic work

      not just in a spreadsheet, any kind of deterministic work at all.

      find me a reliable way around this. i don't think there is one. mcp/functions are a band aid and not consistent enough when precision is important.

      after almost three years of using LLMs, i have not found a single case where i didn't have to review its output, which takes as long or longer than doing it by hand.

      ML/AI is not my domain, so my knowledge is not deep nor technical. this is just my experience. do we need a new architecture to solve these problems?

      • baconbrand 2 hours ago

        ML/AI is not my domain but you don’t have to get all that technical to understand that LLMs run on probability. We need a new architecture to solve these problems.

    • mrcwinn 4 hours ago

      I couldn’t agree more. I get all my perfectly deterministic work output from human beings!

      • goatlover 4 hours ago

        If only we had created some device that could perform deterministic calculations and then wrote software that made it easy for humans to use such calculations.

        • bryanrasmussen 3 hours ago

          ok but humans are idiots, if only we could make some sort of Alternate Idiot, a non-human but every bit as generally stupid as humans are! This A.I would be able to do every stupid thing humans did with the device that performed deterministic calculations only many times faster!

          • baconbrand 2 hours ago

            Yes and when the AI did that all the stupid humans could accept its output without question. This would save the humans a lot of work and thought and personal responsibility for any mistakes! See also Israel’s Lavender for an exciting example of this in action.

  • A4ET8a8uTh0_v2 8 minutes ago

    It is bad in a very specific sense, but I did not see any other comments express the bad parts instead of focusing merely on the accuracy part ( which is an issue, but not the issue ):

    - this opens up a ridiculous flood of data that would otherwise be semi-private to one company providing this service

    - this works well on small data sets, but will choke on ones it will need to divvy up into chunks, inviting interesting ( and yet unknown ) errors

    There is a real benefit to being able to 'talk to data', but anyone who has seen corporate culture up close and personal knows exactly where it will end.

    edit: and I am saying all this as a person who actually likes LLMs.

  • mapt 24 minutes ago

    The vast majority of people in business and science are using spreadsheets for complex algorithmic things they weren't really designed for, and we find a metric fuckton of errors in the sheets when you actually bother auditing them, mistakes which are not at all obvious without troubleshooting by... manually checking each and every cell & cell relation, peering through parentheses, following references. It's a nightmare to troubleshoot.

    LLMs specialize in making up plausible things with a minimum of human effort, but their downside is that they're very good at making up plausible things which are covertly erroneous. It's a nightmare to troubleshoot.

    There is already an abject inability to provision the labor to verify Excel reasoning when it's composed by humans.

    I'm dead certain that Claude will be able to produce plausibly correct spreadsheets. How important is accuracy to you? How life-critical is the end result? What are your odds, with the current auditing workflow?

    Okay! Now! Half of the users just got laid off because management thinks Claude is Good Enough. How about now?

    • practice9 18 minutes ago

      LLMs are getting quite good at reviewing the results and implementations, though

  • pluc 33 minutes ago

    Anthropic now has all your company's data, and all you saved was the cost of one human minus however much they charge for this. The good news is it can't have your data again! So starting from the 163rd-165th person you fire, you start to see a good return and all you've sacrificed is exactitude, precision, judgement, customer service and a little bit of public perception!

  • pavel_lishin 4 hours ago

    My concern is that my insurance company will reject a claim, or worse, because of something an LLM did to a spreadsheet.

    Now, granted, that can also happen because Alex fat-fingered something in a cell, but that's something that's much easier to track down and reverse.

    • manquer 3 hours ago

      They are already doing that with AI, rejecting claims at higher numbers than before.

      Privatized insurance will always find a way to pay out less if they can get away with it. It is just the nature of having the trifecta of profit motive, socialized risk and light regulation.

      • smithkl42 2 hours ago

        If you think that insurance companies have "light regulation", I shudder to think of what "heavy regulation" would look like. (Source: I'm the CTO at an insurance company.)

        • manquer an hour ago

          "Light" was not meant to imply the quantity of paperwork you have to do, but rather whether you are allowed to do the things you want to do as a company.

          More compliance or reporting requirements usually tend to favor the larger existing players who can afford to meet them, and they are also used to make life difficult for the end user and reject more claims.

          It is the kind of thing that keeps you and me busy; major investors don't care about it at all. The cost of compliance, or of the lack of it, is no more than a rounding error on the balance sheet, and the fines and penalties are puny and laughable.

          The enormous profits year on year for decades now, and the amount of consolidation allowed in the industry, show that the industry is able to do pretty much what it wants. That is what I meant by light regulation.

        • lotsofpulp an hour ago

          They have too much regulation, and too little auditing (at least in the managed healthcare business).

          • nxobject 9 minutes ago

            I agree, and I can see where it comes from (at least at the state level). The cycle is: bad trend happens that has deep root causes (let's say PE buying rural hospitals because of reduced Medicaid/Medicare reimbursements); legislators (rightfully) say "this shouldn't happen", but don't have the ability to address the deep root causes so they simply regulate healthcare M&As – now you have a bandaid on a problem that's going to pop up elsewhere.

            • lotsofpulp 5 minutes ago

              I mean even in the simple stuff like denying payment for healthcare that should have been covered. CMS will come by and audit a handful of cases, out of millions, every few years.

              So obviously the company that prioritizes accuracy of coverage decisions by spending money on extra labor to audit itself is wasting money. Which means insureds have to waste more time getting the payment for healthcare they need.

      • JumpCrisscross 2 hours ago

        > They are already doing that with AI, rejecting claims at higher numbers than before

        Source?

        • nartho 39 minutes ago

          Haven't risk-based models been a thing for the last 15-20 years?

      • philipallstar 2 hours ago

        > It is just the nature of having the trifecta of profit motive, socialized risk and light regulation.

        It's the nature of everything. They agree to pay you for something. It's nothing specific to "profit motive" in the sense you mean it.

        • manquer an hour ago

          I should have been clearer - profit maximization above all else as long as it is mostly legal. Neither profit nor profit maximization at all costs is the nature of everything.

          There are many other entity types, from unions[1], cooperatives, public sector companies, quasi-government entities, PBCs, and non-profits, that all offer insurance and can occasionally do it well.

          We even have some in the US and don’t think it is communism even - like the FDIC or things like social security/ unemployment insurance.

          At some level, are government and taxation themselves nothing but insurance? We agree to pay taxes to mitigate a variety of risks, including foreign invasion or smaller things like getting robbed on the street.

          [1] Historically worker collectives or unions self-organized to socialize the risks of both major work-ending injuries or death.

          Armies from ancient to modern times operate because of this insurance. The two ingredients that make them not mercenaries are a form of long-term insurance benefit (education, pension, land, etc.) for them or their family members in the event of death, and sovereign immunity for their actions.

      • jimbokun 2 hours ago

        Couldn't they accomplish the same thing by rejecting a certain percentage of claims totally at random?

        • manquer an hour ago

          That would be illegal though; the goal is to do this legally after all.

          We also have to remember all claims aren't equal, i.e. some claims end up being way costlier than others. You can achieve similar % margin outcomes by adding a ton of friction: preconditions, multiple appeals processes, prior authorization for prior authorization, reviews by administrative doctors who have no expertise in the field being reviewed and don't have to disclose their identity, and so on.

          While the U.S. system is the most extreme or evolved, it is not unique. It is what you get when you privatize insurance; any country with private insurance has some lighter version of this and is on the same journey.

          Not that public health systems or insurance a la the NHS in the UK or Germany's work well either; they are underfunded and mismanaged, with wait times of months to see a specialist, and so on.

          We have to choose our poison - unless you are rich of course, then the U.S. system is by far the best, people travel to the U.S. to get the kind of care that is not possible anywhere else.

          • jimbokun 18 minutes ago

            Why does saying "AI did it" make it legal, if the outcome is the same?

      • keernan 2 hours ago

        >>They are already doing that with AI, rejecting claims at higher numbers than before.

        That's a feature, not a bug.

        • elpakal an hour ago

          This is a great application of this quote. Insurance providers have 0 incentive to make their AI "good" at processing claims, in fact it's easy to see how "bad" AI can lead to a justification to deny more claims.

    • wombatpm an hour ago

      Wait until a company has to restate earnings because of a bug in a Claudified Excel spreadsheet.

  • threetonesun an hour ago

    Probably because many people here are software developers, and wrapping spreadsheets in deterministic logic and a consistent UI covers... most software use cases.

  • gadders 4 hours ago

    Yeah, this could be a pretty big deal. Not everyone is an excel expert, but nearly everyone finds themselves having to work with data in excel at some time or other.

  • lacker 40 minutes ago

    It's like the negativity whenever a post talks about hiring or firing. A lot of people are afraid that they are going to lose their jobs to AI.

  • hbarka 3 hours ago

    What does scaffolding of spreadsheets mean? I see the term scaffolding frequently in the context of AI-related articles and am not familiar with this method, and I’m hesitant to ask an LLM.

    • Rudybega 3 hours ago

      Scaffolding typically just refers to a larger state machine style control flow governing an agent's behavior and the suite of external tools it has access to.

  • BuildItBusk 2 hours ago

    I have to admit that my first thought was “April’s fool”. But you are right. It makes a lot of sense (if they can get it to work well). Not only is Excel the world’s biggest “programming language”. It’s probably also one of the most unintuitive ways to program.

    • baq an hour ago

      If you exclude macros with IO it’s actually the most popular purely functional programming language (no quotes) on the planet by far.

  • protonbob 2 hours ago

    > but these jobs are going to be the first on the chopping block as these integrations mature.

    Perhaps this is part of the negativity? This is a bad thing for the middle class.

    • jpadkins an hour ago

      in the short run. In the long run, productivity gains benefit* all of us (in a functional market economy).

      *material benefit. In terms of spirit and purpose, the older I get the more I think maybe the Amish are on to something. Work gives our lives purpose, and the closer the work is to our core needs, the better it feels. Labor saving so that most of us are just entertaining each other on social networks may lead to a worse society (but hey, our material needs are met!)

    • informal007 an hour ago

      agree with you, but it cannot be stopped. development of technology always makes wealth distribution more centralized

  • informal007 an hour ago

    this will push the development of open source models.

    people think of privacy first when it comes to data, so local deployment of open source models is the first choice for them

  • tokai 2 hours ago

    Whats with claiming negativity when most of the comments here are positive?

  • Workaccount2 an hour ago

    I think excel is a dead end. LLM agents will probably greatly prefer SQL, sqlite, and Python instead of bulky made-for-regular-folks excel.

    Versatility and efficiency explode while human usability tanks, but who cares at that point?

    • informal007 an hour ago

      Databases might be the future, but the viable solutions built on excel are evidence that it works

  • intended 4 hours ago

    I used to live in excel.

    The issue isn’t in creating a new monstrosity in excel.

    The issue is the poor SoB who has to spelunk through the damn thing to figure out what it does.

    Excel is the sweet spot of just enough to be useful, capable enough to be extensible, yet gated enough to ensure everyone doesn’t auto run foreign macros (or whatever horror is more appropriate).

    In the simplest terms - it’s not excel, it’s the business logic. If an excel file works, it’s because there’s someone who “gets” it in the firm.

    • extr 4 hours ago

      I used to live in Excel too. I've trudged through plenty of awful worksheets. The output I've seen from AI is actually more neatly organized than most of what I used to receive in outlook. Most of that wasn't hyper-sophisticated cap table analyses. It was analysis from a Jr Analyst or line employee trying to combine a few different data sources to get some signal on how XYZ function of the business was performing. AI automation is perfectly suitable for this.

      • intended 3 hours ago

        How?

        Neat formatting didn't save any model from having the wrong formula pasted in.

        Being neat was never a substitute for being well rested, or sufficiently caffeinated.

        Have you seen how AI functions in the hands of someone who isn't a domain expert? I've used it for things I had no idea about, like Astro+ web dev. User ignorance was magnified spectacularly.

        This is going to have Jr Analysts dumping well formatted junk in email boxes within a month.

  • gedy 4 hours ago

    It's actually really cool. I will say that "spreadsheets" remain a bandaid over dysfunctional UIs, processes, etc and engineering spends a lot of time enabling these bandaids vs someone just saying "I need to see number X" and not "a BI analytics data in a realtime spreadsheet!", etc.

  • doctorpangloss 4 hours ago

    > What is with the negativity in these comments?

    Some people - normal people - understand the difference between the holistic experience of a mathematically informed opinion and an actual model.

    It's just that normal people always wanted the holistic experience of an answer. Hardly anyone wants a right answer. They have an answer in their heads, and they want a defensible journey to that answer. That is the purpose of Excel in 95% of places it is used.

    Lately people have been calling this "sycophancy." This was always the problem. Sycophancy is the product.

    Claude Excel is leaning deeply into this garbage.

    • extr 4 hours ago

      It seems like to me the answer is moreso "People on HN are so far removed from the real use cases for this kind of automation they simply have no idea what they're talking about".

      • genrader 3 hours ago

        This is so correct it hurts

  • behnamoh 3 hours ago

    > How teams use Claude for Excel

    Who are these teams that can get value from Anthropic? One MCP and my context window is used up and Claude tells me to start a new chat.

causal 3 hours ago

Seems everyone is speculating features instead of just reading TFA which does in fact list features:

- Get answers about any cell in seconds: Navigate complex models instantly. Ask Claude about specific formulas, entire worksheets, or calculation flows across tabs. Every explanation includes cell-level citations so you can verify the logic.

- Test scenarios without breaking formulas: Update assumptions across your entire model while preserving all dependencies. Test different scenarios quickly—Claude highlights every change with explanations for full transparency.

- Debug and fix errors: Trace #REF!, #VALUE!, and circular reference errors to their source in seconds. Claude explains what went wrong and how to fix it without disrupting the rest of your model.

- Build models or fill existing templates: Create draft financial models from scratch based on your requirements. Or populate existing templates with fresh data while maintaining all formulas and structure.

  • Balgair an hour ago

    If this can reliably deal with the REF, VALUE, and NA problems, it'll be worth it for that alone.

    Oh and deal with dates before 1900.

    Excel is a gift from God if you stay in its lane. If you ever so slightly deviate, not even the Devil can help you.

    But maybe, juuuuust maybe, AI can?

    • libraryatnight 30 minutes ago

      "not even the Devil can help you.

      But maybe, juuuuust maybe, AI can?"

      Bold assumption that the devil and AI aren't aligned ;)

  • beefnugs 18 minutes ago

    Also people complaining about AI inaccuracy are just technical people that like precision. The vast majority of the world is people who don't give a damn about accuracy or even correctness. They just want to appear not completely useless to people that could potentially affect their salary

Havoc 2 hours ago

They can try, but doubt anyone serious will adopt it.

Tried integrating chatgpt into my finance job to see how far I can get. Mega yikes...millions of dollars of hallucinated mistakes.

Worse you don't have the same tight feedback loop you've got in programming that'll tell you when something is wrong. Compile errors, unit tests etc. You basically need to walk through everything it did to figure out what's real and what's hallucinations. Basically fails silently. If they roll that out at scale in the financial system...interesting times ahead.

Still presumably there is something around spreadsheets it'll be able to do - the spreadsheet equivalent of boilerplate code whatever that may be

  • AppleBananaPie 38 minutes ago

    I'm bad with spreadsheets so maybe this is trivial, but having an LLM tell me how to connect my sheet to whatever data I'm using at the moment, and having it come up with a link or SQL query or both, has allowed me to quickly pull in data where I'd normally eyeball it and move on or, worst case, do it partially manually if it's really important.

    It's like one off scripts in a sense? I'm not doing complex formulas I just need to know how I can pull data into a sheet and then I'll bucketize or graph it myself.

    Again probably because I'm not the most adept user but it has definitely been a positive use case for me.

    I suspect my use case is pretty boilerplatey :)

btown 43 minutes ago

From the signup form mentioning Private Equity / Venture Capital, Hedge Fund, Investment Banking... this seems squarely aimed at financial modeling. Which is really, really cool.

I've worked alongside sell-side investment bankers in a prior startup, and so much of the work is in taking a messy set of statements from a company, understanding the underlying assumptions, and building, and rebuilding, and rebuilding, 3-statement models that not only adhere to standard conventions (perhaps best introed by https://www.wallstreetprep.com/knowledge/build-integrated-3-... ) but also are highly customized for different assumptions that can range from seasonality to sensitivity to creative deal structures.

It is quite common for people to pull many, many all-nighters to try to tweak these models in response to a senior banker or a client having an idea! And one might argue there are way too many similar-looking numbers to keep a human banker from "hallucinating," much less an LLM.

But fundamentally, a 3-statement model and all its build-sheets are a dependency graph with loosely connected human-readable labels, and that means you can write tools that let an LLM crawl that dependency graph in a reliable and semantically meaningful way. And that lets you build really cool things, really fast.
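
To make the dependency-graph point concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the cell addresses, labels, and formulas are made up, the regex only understands plain same-sheet references like B2 (no ranges, sheet names, or absolute references), and it is not a description of how Claude for Excel or any other product works.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical mini-model: cell -> (human-readable label, formula or constant).
cells = {
    "B1": ("Revenue", "1000"),
    "B2": ("COGS", "=B1*0.4"),
    "B3": ("Gross profit", "=B1-B2"),
    "B4": ("Opex", "300"),
    "B5": ("EBIT", "=B3-B4"),
}

# Matches simple references such as B2 or AA10 (illustrative, not a full parser).
REF = re.compile(r"\b[A-Z]{1,3}[0-9]+\b")

def precedents(formula: str) -> set[str]:
    """Cells a formula reads directly; constants have none."""
    return set(REF.findall(formula)) if formula.startswith("=") else set()

# Node -> set of cells it depends on.
graph = {addr: precedents(formula) for addr, (_, formula) in cells.items()}

def upstream(addr: str) -> set[str]:
    """Every cell that feeds into addr, directly or indirectly."""
    seen, stack = set(), [addr]
    while stack:
        for dep in graph[stack.pop()]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

# A valid evaluation order; raises graphlib.CycleError on circular references.
order = list(TopologicalSorter(graph).static_order())

# Labels of everything EBIT depends on, the kind of context a tool could hand to a model.
ebit_inputs = sorted(cells[a][0] for a in upstream("B5"))
```

A tool built along these lines could answer "what feeds into EBIT?" by walking `upstream("B5")` and returning the labels, rather than asking the model to scan a grid of similar-looking numbers.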

I'm of the opinion that giving small companies the ability to present their finances to investors, the same way Fortune 500 companies hire armies of bankers to do, is vital to a healthy economy, and to giving Main Street the best possible chance to succeed and grow. This is a massive step in the right direction.

  • JonChesterfield 40 minutes ago

    Presenting your finances to investors via a tool designed for generation of plausible looking data is fraud.

    • ceh123 32 minutes ago

      Presenting false data to investors is fraud, doesn't matter how it was generated. In fact, humans are quite good at "generating plausible looking data", doesn't mean human generated spreadsheets are fraud.

      On the other hand, presenting truthful data to investors is distinctly not fraud, and this again does not depend on the generation method.

      • alfalfasprout 28 minutes ago

        If humans "generate plausible looking data" despite any processes to ensure data quality they've likely engaged in willful fraud.

        An LLM doing so needn't even be willful from the author's part. We're going to see issues with forecasts/slide decks full of inaccuracies that are hard to review.

    • Kydlaw 27 minutes ago

      You might have accidentally described what accounting is.

serf 2 hours ago

Anthropic is in a weird place for me right now. They're growing fast, creating little projects that I'd love to try, but their customer service was so bad for me as a Max subscriber that I set an ethical boundary for myself to avoid their services until it appears that they care about their customers at all.

I keep searching for a sign, but everyone I talk to has horror stories. It sucks as a technologist that just wants to play with the thing; oh well.

  • consumer451 an hour ago

    > I keep searching for a sign, but everyone I talk to has horror stories. It sucks as a technologist that just wants to play with the thing; oh well.

    The reason that Claude Code doesn't have an IDE is because ~"we think the IDE will be obsolete in a year, so it seemed like a waste of time to create one."

    Noam Shazeer said on a Dwarkesh podcast that he stopped cleaning his garage, because a robot will be able to do it very soon.

    If you are operating under the beliefs these folks have, then things like IDEs, cleaning up, and customer service are stupid annoyances that will become obsolete very soon.

    To be clear, I have huge respect for everyone mentioned above, especially Noam.

    • chairmansteve 39 minutes ago

      "Noam Shazeer said on a Dwarkesh podcast that he stopped cleaning his garage, because a robot will be able to do it very soon".

      How much is the robot going to cost in a year? 100k? 200k? Not mass market pricing for sure.

      Meanwhile, today he could pay someone $1000 to clean his garage.

      • consumer451 24 minutes ago

        I would do it for free, just to answer the question of what does a genius of his caliber have in his garage? Probably the same stuff most people do, but it would still be interesting.

        I don’t think the point was about having a clean space, it was in response to a question along the lines of: when do you think we will achieve AGI?

  • informal007 39 minutes ago

    bad customer service comes from low priority. I think anthropic prioritizes new growth points over feedback from a small number of customers. that’s why they publish new products and features so frequently: there are so many potential opportunities for them to focus on

  • redhale an hour ago

    What happened? I'm a Max subscriber and I'd like to know what to look out for!

  • cmrdporcupine an hour ago

    Best way to think of it is this: Right now you are not the customer. Investors are.

    The money people pay in monthly fees to Anthropic for even the top Max sub likely doesn't come close to covering the energy & infrastructure costs for running the system.

    You can prove this to yourself by just trying to cost out what it takes to build the hardware capable of running a model of this size at this speed and running it locally. It's tens of thousands of dollars just to build the hardware, not even considering the energy bills.

    So I imagine the goal right now is to pull in a mass audience and prove the model, to get people hooked, to get management and talent at software firms pushing these tools.

    And I guess there's some in management and the investment community that thinks this will come with huge labour cost reductions but I think they may be dreaming.

    ... And then.. I guess... jack the price up? Or wait for Moore's Law?

    So it's not a surprise to me they're not jumping to try and service individual subscribers who are paying probably a fraction of what it costs them to run the service.

    I dunno, I got sick of paying the price for Max and I now use the Claude Code tool but redirect it to DeepSeek's API and use their (inferior but still tolerable) model via API. It's probably 1/4 the cost for about 3/4 the product. It's actually amazing how much of the intelligence is built into the tool itself instead of just the model. It's often incredibly hard to tell the difference between DeepSeek output and what I got from Sonnet 4 or Sonnet 4.5

    • Wowfunhappy 24 minutes ago

      I've been playing around with local LLMs in Ollama, just for fun. I have an RTX 4080 Super, a Ryzen 5950X with 32 threads, and 64 GB of system memory. A very good computer, but decidedly consumer-level hardware.

      I have primarily been using the 120b gpt-oss model. It's definitely worse than Claude and GPT-5, but not by, like, an order of magnitude or anything. It's also clearly better than ChatGPT was when it first came out. Text generates a bit slowly, but it's perfectly usable.

      So it doesn't seem so unreasonable to me that costs could come down in a few years?

    • kridsdale1 an hour ago

      You are bang on.

      Every AI company right now (except Google, Meta, and Microsoft) has its valuation based on the expectation of a future monopoly on AGI. None of their business models today or in the foreseeable horizon are even positive, let alone world-dominating. The continued funding rounds are all apparently based on the expectation of becoming the sole player.

      The continuing advancement of open source / open weights models keeps me from being a believer.

      I’ve placed my bet and feel secure where it is.

kaspermarstal 43 minutes ago

So cool, I hope they pull it off. So many people use Excel. Although, I always thought the power of AI in Excel would come from the ability to use AI _as_ a formula. For example, =PROMPT("Classify user feedback as positive, neutral or negative", A1). This would enable normal people (non-programmers) to fire off thousands of prompts at once and automate workflows like programmers do (disclaimer: I am the author of Cellm that does exactly this). Combined with Excel's built-in functions for deterministic work, Claude could really kill the whole copy-pasting data in and out of chat windows for bulk-processing data.

davidpolberger 2 hours ago

I'm a co-founder of Calcapp, an app builder for formula-driven apps using Excel-like formulas. I spent a couple of days using Claude Code to build 20 new templates for us, and I was blown away. It was able to one-shot most apps, generating competent, intricate apps from having looked at a sample JSON file I put together. I briefly told it about extensions we had made to Excel functions (including lambdas for FILTER, named sort type enums for XMATCH, etc), and it picked those up immediately.

At one point, it generated a verbose formula and mentioned, off-handedly, that it would have been prettier had Calcapp supported LET. "It does!", I replied, "and as an extension, you can use := instead of , to separate names and values!") and it promptly rewrote it using our extended syntax, producing a sleek formula.

These templates were for various verticals, like real estate, financial planning and retail, and I would have been hard-pressed to produce them without Claude's domain knowledge. And I did it in a weekend! Well, "we" did it in a weekend.

So this development doesn't really surprise me. I'm sure that Claude will be right at home in Excel, and I have already thought about how great it would be if Claude Code found a permanent home in our app designer. I'm concerned about the cost, though, so I'm holding off for now. But it does seem unfair that I get to use Claude to write apps with Calcapp, while our customers don't get that privilege.

(I wrote more about integrating Claude Code here: https://news.ycombinator.com/item?id=45662229)

NumberCruncher 13 minutes ago

At first glance this seems to be a very bad idea. But re-reading this:

> Get answers about any cell in seconds: Navigate complex models instantly. Ask Claude about specific formulas, entire worksheets, or calculation flows across tabs. Every explanation includes cell-level citations so you can verify the logic.

this might just be an excellent tool for refactoring Excel sheets into something more robust and maintainable. And making a bunch of suits redundant.

martinald 4 hours ago

This is going to be massive if it works as well as I suspect it might.

I think many software engineers overlook how many companies have huge (billion dollar) processes run through Excel.

It's much less about 'greenfield' new excel sheets and much more about fixing/improving existing ones. If it works as well as Claude Code works for code, then it will get pretty crazy adoption I suspect (unless Microsoft beats them to it).

  • lm28469 3 hours ago

    > I think many software engineers overlook how many companies have huge (billion dollar) processes run through Excel.

    So they can fire the two dudes that take care of it, lose 15 years of in house knowledge to save 200k a year and cry in a few months when their magic tool shits the bed ?

    Massive win indeed

    • bsenftner 2 hours ago

      If the company is half baked, those "two dudes" will become indispensable beyond belief. They are the ones that understand how Excel works far deeper, and paired with Claude for Excel they become far far more valuable.

      • Balgair an hour ago

        At my org it's more that these AI tools finally allow the employees to get through things at all. The deadlines are getting met for the first time, maybe ever. We can at last get to the projects that will make the company money instead of chasing ghosts from 2021. The burn down charts are warm now.

  • thewebguyd 4 hours ago

    > This is going to be massive if it works as well as I suspect it might.

    Until Microsoft does its anti-competitive thing and finds a way to break this in the file format, because this is exactly what copilot in excel does.

    That said, Copilot in Excel is pretty much hot garbage still so anything will be better than that.

    • NotMichaelBay 2 hours ago

      What do you mean, what is copilot in excel doing exactly?

JonChesterfield 41 minutes ago

The thing really missing from multi-megabyte excel sheets of business critical carnage was a non-deterministic rewrite tool. It'll interact excitingly with the industry standard of no automated testing whatsoever.

I 100% believe generative AI can change a spreadsheet. Turn the xlsx into text, mutate that, turn it back into an xlsx, throw it away if it didn't parse at all. The result will look pretty similar to the original too, since spreadsheets are great at showing immediately local context and nothing else.

Also, we've done a pretty good job of training people that chatgpt works great, so there's good reason for them to expect claude for excel to work great too.

I'd really like the results of this to be considered negligence with non-survivable fines for the reckless stupidity, but more likely, it'll be seen as an act of god. Like all the other broken shit in the IT world.

gwbas1c 31 minutes ago

I wonder if this will be more/less useful than what we have with AI in software development.

There's a lot less to understand than a whole codebase.

I don't do spreadsheets very often, but I can empathize with tracking down "Trace #REF!, #VALUE!, and circular reference errors to their source in seconds." I once hit something like that, and I found it a lot harder to trace than a typical compiler error.

mattas 4 hours ago

I'm not excited about having LLMs generate spreadsheets or formulas. But, I think LLMs could be particularly useful in helping me find inconsistent formulas or errors that are challenging to identify. Especially in larger, complex spreadsheets touched by multiple people over the course of months.
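
As a rough illustration of what "inconsistent formula" can mean, here is a small Python sketch (standard library only). It is a deterministic baseline, not an LLM and not anything from the announcement: the column of formulas is hypothetical, and the regex only handles simple same-sheet references. The idea is to rewrite each formula relative to its own row and flag rows whose shape differs from the majority.

```python
import re
from collections import Counter

# Hypothetical column D: each row should multiply quantity by price from its own row.
formulas = {
    2: "=B2*C2",
    3: "=B3*C3",
    4: "=B4*C3",  # pasted wrong: points at the previous row's price
    5: "=B5*C5",
    6: "=B6*C6",
}

# Matches simple references such as B4 (illustrative, not a full formula parser).
REF = re.compile(r"\b([A-Z]{1,3})([0-9]+)\b")

def relative_form(formula: str, row: int) -> str:
    """Rewrite references as column plus row offset, e.g. B4 seen from row 4 -> B[+0]."""
    return REF.sub(lambda m: f"{m.group(1)}[{int(m.group(2)) - row:+d}]", formula)

shapes = {row: relative_form(f, row) for row, f in formulas.items()}

# The most common shape is treated as the intended pattern for the column.
majority, _ = Counter(shapes.values()).most_common(1)[0]
outliers = sorted(row for row, shape in shapes.items() if shape != majority)
```

Here `outliers` contains only row 4. A check like this finds the mechanical cases; the harder question of whether the majority pattern is itself the right business logic is where a reviewer (human or model) would still be needed.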

  • thesuitonym 3 hours ago

    For once in my life, I actually had a delightful interaction with an LLM last week. I was changing some text in an Excel sheet in a very programmatic way that could have easily been done with the regex functions in Excel. But I'm not really great with regex, and it was only 15 or so cells, so I was content to just do it manually. After three or four cells, Copilot figured out what I was doing and suggested the rest of the changes for me.

    This is what I want AI to do, not generate wrong answers and hallucinate girlfriends.

  • bambax 2 hours ago

    One approach is to produce read-only data in BI tools: users are free to export anything they want and make their own spreadsheets, but those are for their own use only. Reference data is produced every day by a central, controlled process and cannot in any circumstance be modified by the end user.

    I have implemented this a couple of times and not only does it work well, it tends to be fairly well accepted. People need spreadsheets to work on them, but generally they kind of hate sending those around via email. Having a reference source of data is welcomed.

michaelmarkell 3 hours ago

IMO, a real solution here has to be hybrid, not full LLM, because these sheets can be massive and have very complicated structures. You want to be able to use the LLM to identify / map column headers, while using non-LLM tool calling to run Excel operations like SUMIFs or VLOOKUPs. One of the most important traits in these systems is consistency with slight variation in file layout, as so much Excel work involves consolidating / reconciling between reports made on a quarterly basis or produced by a variety of sources, with different reporting structures.
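
A minimal sketch of that hybrid split, in Python with only the standard library. The two "reports", the header synonyms, and the function names are all hypothetical. In particular, `map_headers` is a plain lookup standing in for the LLM step, and the subtotal-skipping rule is one assumed example of a layout quirk the deterministic side has to handle.

```python
import csv
import io
from decimal import Decimal

# Two hypothetical quarterly exports with different headers and column order.
Q1 = "item,date,price\nabc,01/01/2023,$30\ncde,02/01/2023,$40\nsubtotal,,$70\ndef,03/01/2023,$20\n"
Q2 = "Amount (USD),SKU,Posted\n$15,abc,04/01/2023\n$25,xyz,05/01/2023\n"

def map_headers(headers):
    """Stand-in for the LLM step: map raw headers to canonical column names.
    A real system would ask a model and validate the answer; this lookup is illustrative."""
    synonyms = {"item": "item", "sku": "item", "price": "amount", "amount (usd)": "amount"}
    return {h: synonyms[h.strip().lower()] for h in headers if h.strip().lower() in synonyms}

def total_amount(text):
    """Deterministic step: sum the amount column, skipping embedded subtotal rows."""
    reader = csv.DictReader(io.StringIO(text))
    mapping = map_headers(reader.fieldnames)
    item_col = next(h for h, c in mapping.items() if c == "item")
    amount_col = next(h for h, c in mapping.items() if c == "amount")
    total = Decimal("0")
    for row in reader:
        # Embedded subtotal/total rows would otherwise be double counted.
        if row[item_col].strip().lower().startswith(("subtotal", "total")):
            continue
        total += Decimal(row[amount_col].replace("$", "").replace(",", ""))
    return total
```

With these inputs, `total_amount(Q1)` comes to 90 and `total_amount(Q2)` to 40, even though the two files disagree on header names and column order. The model only decides which column means what; the arithmetic itself never goes through it.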

Disclosure: My company builds ingestion pipelines for large multi-tab Excel files, PDFs, and CSVs.

  • dcre 3 hours ago
    • levocardia 2 hours ago

      "This won't work because (something obvious that engineers at Anthropic clearly thought of already)"

      • michaelmarkell an hour ago

        Not really. Take for example:

        item, date, price

        abc, 01/01/2023, $30

        cde, 02/01/2023, $40

        ... 100k rows ...

        subtotal. $1000

        def, 03/01,2023, $20

        "Hey Claude, what's the total from this file?" > grep for headers > "Ah, I see column 3 is the price value" > SUM(C3:C) -> $2020 > "Great! I found your total!"

        If you can find me an example of tech that can solve this at scale on large, diverse Excel formats, then I'll concede, but I haven't found something actually trustworthy for important data sets

  • sunnybeetroot 3 hours ago

    So more or less like what AI has been doing for the last couple of years when it comes to writing code?

rahimnathwani an hour ago

How is this different from the existing Claude skill, that uses a prompt and pandas to edit an Excel file?

https://github.com/anthropics/skills/blob/main/document-skil...

  • shooker435 38 minutes ago

    This isn't built for Excel users who use Github and Claude Skills, it's built for Excel users who would run away from Git commands.

    • rahimnathwani 36 minutes ago

      The Claude skill I linked to is built into the Claude desktop client. You just attach an Excel file to your chat and ask away.

      I linked to the skill prompt just to more clearly explain the approach that's currently available to all Claude users.

      It requires zero familiarity with git or command line.

patife 40 minutes ago

Fuck, Rows is at least 3x better

travisgriggs 2 hours ago

As I was reading through the post, and the comments here, and pondering my own many hours with these tools, I was suddenly reminded of one of my favorite studio C sketches: An Unfortunate Fortune

https://www.youtube.com/watch?v=SF-psoWdSpo

Curious, if others see the connection. :D

warthog 4 hours ago

Tough day to be an AI Excel add-in startup

  • mitjam an hour ago

    Ask Rosie is actually shutting down right now: https://www.askrosie.ai/

    I would love to learn more about their challenges as I have been working on an Excel AI add-in for quite some time and have followed Ask Rosie from almost their start.

    That they have now gone through the whole cycle makes me worry that I’m too slow as a solo builder working on the side in these fast-paced times.

  • 8note 3 hours ago

    it's a great time for your ai excel add-in to start getting acquired by a claude competitor though

    • NotMichaelBay 2 hours ago

      Not OpenAI, though, because they already gave $14M to an AI Excel add-in startup (Endex)

  • jonathanstrange 4 hours ago

    That seems to be true for any startup that offers a wrapper to existing AIs rather than an AI on their own. The lucky ones might be bought but many if not most of them will perish trying to compete with companies that actually create AI models and companies large enough to integrate their own wrappers.

unshavedyak an hour ago

Dumb question, but is this Claude for Excel the.. app? The webapp? Does it work on Google sheets? etc

There are quite a few spreadsheet apps out there, just curious what their implementation is or how it's implemented to work with multiple apps.

I always find Excel (and the Office ecosystem) confusing heh.

  • p_ing an hour ago

    Modern Excel add-ins work in desktop Windows, macOS, and web. They're just a bit of XML that Excel looks at to call whatever web endpoint is defined in the XML.

garyclarke27 4 hours ago

I guess Claude may be useful for finding errors in large Excel Workbooks. It may also help beginners learn the more complex Excel functions (which are still pretty easy). But if you are proficient at building Excel models I don't see any benefit. Excel already has a superb, very efficient UI for entering formulas, ranges, tables, data sources etc. I'm sceptical that a different UI, especially a text based one, can improve on this.

  • proteal 4 hours ago

    I understand the sentiment about a skilled user not needing this, but I think having a little buddy that I can use to offload some menial tasks would be helpful for me to iterate through my models more efficiently; even if the AI is not perfect. As a highly skilled excel user, I admit the software has terrible ergonomics. It would be a productivity boon for me if an AI can help me stay focused on model design vs model implementation.

  • intended 3 hours ago

    For some reason, I find that these tools are TERRIBLE at helping someone learn. I suspect it's because turning one on results in turning the problem-solving part of one's brain off.

    It's obviously not the same experience for everyone. (If you are one of those energized while working in a chat window, you might be in a minority, given what we see from the ongoing massacre of brains in education.)

    Paraphrasing something I read here: "people don't use ChatGPT to learn more, they use it to study less".

    Maybe some folk would be better off.

vjvjvjvjghv 2 hours ago

Hope it’s better than what MS is currently shipping as AI. Every time I try to do something, the response is “sorry, I can’t do this”.

  • smithkl42 an hour ago

    Copilot is getting better - I'm getting fewer of those than I used to - but it's still significantly more stupid than other agents, even when in theory it's using the same model.

mamonster an hour ago

On the one hand, most financial companies have a lot of processes in Excel that could be made better with something like Claude.

On the other hand: banking secrecy laws + customer identifying data + AI tool = No bueno.

ed_elliott_asc 43 minutes ago

I use excel but not for financial modelling, I’ll use it

jawns 5 hours ago

Gemini already has its hooks in Google Sheets, and to be honest, I've found it very helpful in constructing semi-complicated Excel formulas.

Being able to select a few rows and then use plain language to describe what I want done is a time saver, even though I could probably muddle through the formulas if I needed to.

  • gumby271 4 hours ago

    Last time I tried using Gemini in Google Sheets it hallucinated a bunch of fake data, then gave me a summary that included all that fake data. I'd given it a bunch of transaction data, and asked it to group the records into different categories for budgeting. When asking it to give the largest values in each category, all the values that came back were fake. I'm not sure I'd really trust it to touch a spreadsheet after that.

    • genrader 3 hours ago

      you should:

      - stop using the free plan

      - don't use gemini flash for these tasks

      - learn how to do things over time and know that all ai models have improved significantly every few months

      • ipaddr 2 hours ago

        Or not use it.

  • break_the_bank 5 hours ago

    I would recommend trying TabTabTab at https://tabtabtab.ai/

    It is an entire agent loop. You can ask it to build a multi sheet analysis of your favorite stock and it will. We are seeing a lot of early adopters use it for financial modeling, research automation, and internal reporting tasks that used to take hours.

  • break_the_bank 2 hours ago

    I forgot to add, you can try TabTabTab, without installing anything as well.

    To see something much more powerful on Google Sheets than Gemini for free, you can add "try@tabtabtab.ai" to your sheet, and make a comment tagging "try@tabtabtab.ai" and see it in action.

    If that is too much just go to ttt.new!

  • frankfrank13 4 hours ago

    I have had the opposite experience. I've never had Gemini give me something useful in sheets, and I'm not asking for complicated things. Like "group this data by day" or "give me p50 and p90"

  • dangoodmanUT 4 hours ago

    Gemini integrations in Google Workspace feel like they're using Gemini 1.5 flash, they're so comically bad at understanding and generating

pdyc 3 hours ago

I have just launched a product (easyanalytica.com) to create dashboards from spreadsheets, and Excel is on my to-do list of formats to be supported. However, I'm having second thoughts. Although, from the description, it seems like it would be more helpful on the modeling side rather than the presentation side. I guess I'll have to wait until it's publicly available

  • sunnybeetroot 3 hours ago

    Why second thoughts?

    • pdyc 3 hours ago

      Everyone will use Claude if they support it, so why would they use my product? I will have to find some other angle to differentiate.

humanfromearth9 2 hours ago

This could be invaluable for reverse engineering complex workbooks with multiple data sources and hundreds or thousands of formulas.
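
As a rough illustration of what the first step of that reverse engineering looks like without any AI: an .xlsx file is a zip archive of XML parts, so the formulas can be dumped with the Python standard library alone. This is a minimal sketch, assuming the conventional layout where worksheet parts live under `xl/worksheets/`; the names `list_formulas` and `demo_workbook` are made up for this example, and the demo archive is a stripped-down stand-in rather than a file Excel would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts in .xlsx files.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def list_formulas(xlsx):
    """Return (sheet_part, cell_ref, formula) for every cell that stores formula text.

    `xlsx` is a path or a binary file-like object.
    """
    found = []
    with zipfile.ZipFile(xlsx) as zf:
        for name in sorted(zf.namelist()):
            # Assumption: worksheets sit in the conventional xl/worksheets/ folder.
            if not (name.startswith("xl/worksheets/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(zf.read(name))
            for cell in root.iterfind(".//m:c", NS):
                f = cell.find("m:f", NS)
                # Followers of a shared formula carry an empty <f>, so they are skipped.
                if f is not None and f.text:
                    found.append((name, cell.get("r"), "=" + f.text))
    return found


def demo_workbook():
    """Build a tiny archive holding one worksheet part (hypothetical data)."""
    sheet = (
        '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
        "<sheetData>"
        '<row r="1"><c r="A1"><v>2</v></c><c r="B1"><v>3</v></c>'
        '<c r="C1"><f>SUM(A1:B1)</f><v>5</v></c></row>'
        "</sheetData></worksheet>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/worksheets/sheet1.xml", sheet)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for part, ref, formula in list_formulas(demo_workbook()):
        print(part, ref, formula)
```

A listing like this only gives the raw formulas per cell. Working out how they depend on each other, and on external data sources, is the hard part the comment is pointing at.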

  • pumnikol an hour ago

    If it has a concept of data sources and can digest them, sure. Anecdotally, most issues with Excel at my job are caused by data sources being renamed, moved or reformatted, by broken logins, or by insufficient access rights.

asdev 5 hours ago

George Hotz said there's 5 tiers of AI systems, Tier 1 - Data centers, Tier 2 - fabs, Tier 3 - chip makers, Tier 4 - frontier labs, Tier 5 - Model wrappers. He said Tier 4 is going to eat all the value of Tier 5, and that Tier 5 is worthless. It's looking like that's going to be the case

  • mediaman 4 hours ago

    That is a common refrain by people who have no domain expertise in anything outside of tech.

    Spend a few years in an insurance company, a manufacturing plant, or a hospital, and then the assertion that the frontier labs will figure it out appears patently absurd. (After all, it takes humans years to understand just a part of these institutions, and they have good-functioning memory.)

    This belief that tier 5 is useless is itself a tell of a vulnerability: the LLMs are advancing fastest in domain-expertise-free generalized technical knowledge; if you have no domain expertise outside of tech, you are most vulnerable to their march of capability, and it is those with domain expertise who will rely increasingly less on those who have nothing to offer but generalized technical knowledge.

    • asdev 3 hours ago

      yeah but if Anthropic/OpenAI dedicate resources to gaining domain expertise then any tier 5 is dead in the water. For example, they recently hired a bunch of finance professionals to make specialized models for financial modeling. Any startup in that space will be wiped out

    • HDThoreaun 2 hours ago

      I don't think the claim is exactly that tier 5 is useless; it's more that tier 5 synergizes so well with tier 4 that all the popular tier 5 products will eventually be made by the tier 4 companies.

  • mitjam an hour ago

    Andrew Ng argued in 2023 (https://www.youtube.com/watch?v=5p248yoa3oE) that the underlying tiers depend on the app tier's success.

    That OpenAI is now apparently striving to become the next big app-layer company could hint at George Hotz being right, but only if the bets work out. I'm glad that there is competition on the frontier labs tier.

  • rudedogg 4 hours ago

    Tier 5 requires domain expertise until we reach AGI or something very different from the latest LLMs.

    I don’t think the frontier labs have the bandwidth or domain knowledge (or dare I say skills) to do tier 5 tasks well. Even their chat UIs leave a lot to be desired and that should be their core competency.

  • extr 4 hours ago

    George Hotz says a lot of things. I think he's directionally correct but you could apply this argument to tech as a whole. Even outside of AI, there are plenty of niches where domain-specific solutions matter quite a bit but are too small for the big players to focus on.

  • matsur 4 hours ago

    People were saying the same thing about AWS vs SaaS ("AWS wrappers") a decade ago and none of that came to pass. Same will be true here.

  • tln 4 hours ago

    Claude is a model wrapper, no?

    • piperswe 4 hours ago

      Anthropic is a frontier lab, and Claude is a frontier model

  • benatkin 4 hours ago

    Interesting. I found a reference to this in a tweet [1], and it looks to be a podcast. While I'm not extremely knowledgeable, I'd put it like this: Tier 1 - fabs, Tier 2 - chip makers, Tier 3 - data centers, Tier 4 - frontier labs, Tier 5 - Model wrappers

    However I would think more of elite data centers rather than commodity data centers. That's because I see Tier 4 being deeply involved in their data centers and thinking of buying the chips to feed their data centers. I wouldn't be so inclined to throw in my opinion immediately if I found an article showing this ordering of the tiers, but being a tweet of a podcast it might have just been a rough draft.

    1: https://x.com/tbpn/status/1935072881425400016

wonderwonder 22 minutes ago

Been working with Claude Code lately and been pretty impressed. If this works as well, it could be a nice add-on. It's probably a smart market to enter, as Excel is essentially everywhere.

Just like Claude Code allows 1 dev to potentially do the work of 2 or 3, I could see this allowing 1 accountant or operations person to do the work of 2 or 3. Financial savings but human cost

soared 5 hours ago

It’s interesting to me that this page talks a lot about “debugging models” etc. I would’ve expected (from the title) this to be going after the average excel user, similar to how chatgpt went after every day people.

I would’ve expected “make a vlookup or pivot table that tells me x” or “make this data look good for a slide deck” to be easier problems to solve.
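
For a sense of how small those two requests are as computations, here is a minimal sketch in plain Python. The sample rows, the price table, and the function names `vlookup` and `pivot_sum` are all made up for illustration; this is not how Excel or Claude implements anything.

```python
from collections import defaultdict

# Hypothetical sample rows: (region, product, amount)
sales = [
    ("East", "Widget", 120.0),
    ("East", "Gadget", 80.0),
    ("West", "Widget", 200.0),
    ("East", "Widget", 30.0),
]

# Hypothetical lookup table: product -> unit price
prices = {"Widget": 10.0, "Gadget": 20.0}


def vlookup(key, table, default=None):
    """Exact-match lookup, roughly what VLOOKUP(key, range, 2, FALSE) does.

    Excel shows #N/A for a missing key; this sketch returns `default` instead.
    """
    return table.get(key, default)


def pivot_sum(rows):
    """Sum amount by (region, product), like a pivot table with 'Sum of amount'."""
    totals = defaultdict(float)
    for region, product, amount in rows:
        totals[(region, product)] += amount
    return dict(totals)


if __name__ == "__main__":
    print(vlookup("Gadget", prices))
    for key, total in sorted(pivot_sum(sales).items()):
        print(key, total)
```

The logic is trivial. The harder part for a tool is working out which columns and ranges the user means by "x".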

  • layer8 4 hours ago

    The issue is that the average Excel user doesn’t quite have the skills to validate and double-check the Excel formulas that Claude would produce, and to correct them if needed. It would be similar to a non-programmer vibe-coding an app. And that’s really not what you want to happen for professionally used Excel sheets.

    • soared 4 hours ago

      IMO that is exactly what people want. At my work everyone uses LLMs constantly, and the trade-off of imperfect information is known. People double-check it, etc., but the information search is so much faster: even if it finds the right Confluence page but misquotes it, it still sends me the link.

      For easy spreadsheet stuff (which 80% of average white-collar workers are doing when using Excel) I’d imagine the same approach. Try to do what I want, and even if you’re half wrong the good 50% is still worth it and a better starting point.

      Vibe coding an app is like vibe coding a “model in excel”. Sure you could try, but most people just need to vibe code a pivot table

  • extr 4 hours ago

    I think actually Anthropic themselves are having trouble with imagining how this could be used. Coders think like coders - they are imagining the primary use case being managing large Excel sheets that are like big programs. In reality most Excel worksheets are more like tiny, one-off programs. More like scripts than applications. AI is very very good at scripts.

  • burkaman 5 hours ago

    I think this is aiming to be Claude Code for people who use Excel as a programming environment.

mainecoder 42 minutes ago

Yeah now tell the Auditors that the financial spreadsheet we have here has AI touching it left and right. "I did not cook the books I promise it is the AI that made our financials seem better than they actually are trust me bro!", said Joe from Accounting.

burkaman 5 hours ago

I'm excited to see what national disasters will be caused by auto-generated Excel sheets that nobody on the planet understands. A few selections from past HN threads to prime your imagination:

Thousands of unreported COVID cases: https://news.ycombinator.com/item?id=24689247

Thousands of errors in genetics research papers: https://news.ycombinator.com/item?id=41540950

Wrong winner announced in national election: https://news.ycombinator.com/item?id=36197280

Countries across the world implement counter-productive economic austerity programs: https://en.wikipedia.org/wiki/Growth_in_a_Time_of_Debt#Metho...

  • HPsquared 5 hours ago

    Especially combined with the dynamic array formulas that have recently been added (LET, LAMBDA etc). You can have much more going on within each cell now. Think whole temporary data structures. The "evaluate formula" dialog doesn't quite cut it anymore for debugging.

  • malthaus 4 hours ago

    from my experience in the corporate world, i'd trust an excel generated / checked by an LLM more than i would one that has been organically grown over years in a big corporation where nobody ever checks or even can check anything because it's one big growing pile of technical debt people just accept as working

d--b 5 hours ago

Ok, they weren't confident enough to let the model actually edit the spreadsheet. Phew..

Only a matter of time before someone does it though.

  • password4321 4 hours ago

    How well does change tracking work in Excel... how hard would it be to review LLM changes?

    AFAIK there is no 'git for Excel to diff and undo', especially not built-in (aka 'for free' both cost-wise and add-ons/macros not allowed security-wise).

    My limited experience has been that it is difficult to keep LLMs from changing random things besides what they're asked to change, which could cause big problems if untrackable in Excel.

    • NewsaHackO an hour ago

      I thought there was track changes on all office products. Most Office documents are zip files of XML files and assets, so I'd imagine it would be possible to rollback changes.

  • cube00 5 hours ago

    When I think how easily I can misclick to stuff up a spreadsheet, I can't begin to imagine all the subtle ways LLMs will screw them up.

    Unlike code, where it's all on display, these formulas are hidden in each cell. You won't see the problem unless you click on the cell, so you'll have a hard time finding the cause.

  • tln 4 hours ago

    I wish Gemini could edit more in Google sheets and docs.

    Little stuff like splitting text more intelligently or following the formatting seen elsewhere would be very satisfying.

surume an hour ago

Checkmate, Altman

intended 4 hours ago

As an inveterate Excel lover, I can just sense the blinding pain wafting off the legions of accountants, associates, seniors, and tech people who keep the machine spirits placated.

lies, damn lies, statistics, and then Excel deciding cell data types.

grim_io an hour ago

If this works very well and reliably, it might not kill programming as such, but it might put a lot of small businesses who do custom software for other small businesses out of work.

The HN bubble might not realize the implications.

keernan 2 hours ago

If AI turns out to be the powerhouse it is claimed to be, AI's impact will be corporations replacing corporate dependencies upon 'Excel projects' created by self-taught assistants to department managers.

gedy 4 hours ago

Cool but now companies POs will be like "you must add the Excel export for all the user data!" and when asked why, will basically be "so I can do this roundabout query of data for some number in a spreadsheet using AI (instead of just putting the number or chart directly in the product with a simple db call)"

racl101 4 hours ago

This could be huge! Very exciting!

strange_quark 5 hours ago

Yet more evidence of the bubble burst being imminent. If any of these companies really had some almost-AGI system internally, they wouldn’t be spending any effort making f’ing Excel plugins. Or at the very least, they’d be writing their own Excel because AI is so amazing at coding, right?

  • mitjam an hour ago

    Excel is living business knowledge stuck in private SharePoint sites; tapping into it might kick off a nice data flywheel, not to speak of the nice TAM.

  • ipaddr an hour ago

    You make a great point. Where are all of the complex applications? They haven't been able to create their own office suite or word processor or really anything aside from a Halloween matching game in JS. You would think we would have some complex application they can point to, but nothing.

  • HDThoreaun 2 hours ago

    The current valuations do not require AGI. They require products like this that will replace scores of people doing computer based grunt work. MSFT is worth $4 trillion off the back of enterprise productivity software, the AI labs just need some of that money.

  • pton_xd 3 hours ago

    The fine tuning will continue until we reach AGI.

    • amlib an hour ago

      The fine tuning will continue until we reach the torment nexus, at best

  • qsort 5 hours ago

    You wouldn't believe the amount of shit that runs on Excel.

    • powvans 5 hours ago

      Yes. I once interviewed a developer whose previous job was maintaining the .NET application that used an Excel sheet as the brain for decisions about where to drill for oil on the sea floor. No one understood what was in the Excel sheet. It was built by a geologist who was long gone. The engineering team understood the inputs and outputs. That’s all they needed to know.

      • mwigdahl 4 hours ago

        Years ago when I worked for an engineering consulting company we had to work with a similarly complex, opaque Excel spreadsheet from General Electric modeling the operation of a nuclear power plant in exacting detail.

        Same deal there -- the original author was a genius and was the only person who knew how it was set up or how it worked.

    • strange_quark 3 hours ago

      I think you’re misunderstanding me. This might be something somewhat useful, I don’t know, and I’m not judging it based on that.

      What I’m saying is that if you really believed we were 2, maybe 3 years tops from AGI or the singularity or whatever you would spend 0 effort serving what already seems to be a domain that is already served by 3rd parties that are already using your models! An excel wrapper for an LLM isn’t exactly cutting edge AI research.

      They’re desperate to find something that someone will pay a meaningful amount of money for that even remotely justifies their valuation and continued investment.

    • cube00 5 hours ago

      I spotted a custom dialog in an Excel spreadsheet in a medical context the other day, I was horrified.

    • efields 5 hours ago

      This. I work in Pharma. Excel and faxes.

  • FergusArgyll 5 hours ago

    A program that can do excel for you is almost AGI

tanksterzen 4 hours ago

[flagged]

  • Aboutplants 3 hours ago

    I’ve got some bad news about the prospects of your startup

  • thesuitonym 3 hours ago

    Learn > Documentation is just a single markdown doc?

  • MarcelOlsz 4 hours ago

    Fresh account spam on my HN? Buy an ad somewhere.

    Aaaaand it's gone!

cube00 5 hours ago

[flagged]

  • sdsd 5 hours ago

    Okay. But then you could say the same for a human, isn't your brain just a cloud of matter and electricity that just reacts to senses deterministically?

    • cube00 5 hours ago

      > isn't your brain just a cloud of matter and electricity that just reacts to senses deterministically?

      LLMs are not deterministic.

      I'd argue over the short term humans are more deterministic. I ask a human the same question multiple times and I get the same answer. I ask an LLM and each answer could be very different depending on its "temperature".

      • krzyk 4 hours ago

        If you ask a human the same question repeatedly, you'll get different answers. I think that by the third time you'll get "I already answered that" etc.

    • worldsayshi 4 hours ago

      We hardly react to things deterministically.

      But I agree with the sentiment. It seems it is more important than ever to agree on what it means to understand something.

    • qwertox 4 hours ago

      I'm having a bad day today. I'm 100% certain that today I'll react completely differently to any tiny issue compared to how I did yesterday.

  • NDizzle 4 hours ago

    I mean - try clicking the CoPilot button and see what it can actually do. Last I checked, it told me it couldn't change any of the actual data itself, but it could give you suggestions. Low bar for excellence here.