This is somewhat tangential: I'm just messing around with the data and trying to nail down some basics so I can do sanity checks in my actual code. I'm running into an odd discrepancy when trying to determine the number of times “XMAS” occurs in the input. I'm only concerned with the straightforward matches, i.e. instances of the substring “XMAS” appearing in the raw data.

When I run “grep -c XMAS input” or “rg -c XMAS input”, they both show 107. But when I use regex101.com and search for the pattern XMAS, it shows 191 matches. Please help, I am truly at a loss here. While writing this, it occurred to me to try string.count(“XMAS”) in Python on the data as a raw string, and it also returns 191. So really this question is more about grep and rg than anything. Why are they only returning 107?

  • PhilipTheBucketA
    8 days ago

    Which input are you using? https://adventofcode.com/2024/day/4/input? It looks like you’re not accounting for lines where XMAS appears more than once. grep -c counts the number of lines that contain a match, not the total number of matches. Here’s a quick way to get the total, although my numbers don’t match yours:

    $ grep -c XMAS words
    112
    $ grep -c 'XMAS.*XMAS' words
    56
    $ grep -c 'XMAS.*XMAS.*XMAS' words
    22
    $ grep -c 'XMAS.*XMAS.*XMAS.*XMAS' words
    7
    $ grep -c 'XMAS.*XMAS.*XMAS.*XMAS.*XMAS' words
    1
    $ grep -c 'XMAS.*XMAS.*XMAS.*XMAS.*XMAS.*XMAS' words
    0
    $ echo '1*5+(7-1)*4+(22-7)*3+(56-22)*2+(112-56)' | bc
    198
    

    So 198 instances of XMAS, appearing horizontally from left to right.

    Also, yes, I could have just added up all the numbers, I didn’t realize that right away.
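
Why adding up the counts works: a line with k occurrences is counted once by each of the first k grep commands, so it contributes exactly k to the sum. Here is a minimal Python sketch of that identity. The sample lines and the function names are made up for illustration; they are not the real puzzle input.

```python
# Hypothetical sample lines, not the real puzzle input.
sample_lines = [
    "XMASAXMASMMXMAS",  # 3 occurrences
    "MMXMASSSXMAS",     # 2 occurrences
    "SAMXMASX",         # 1 occurrence
    "MSAMXAMS",         # 0 occurrences
]


def lines_with_at_least(lines, k, word="XMAS"):
    """Count lines containing `word` at least k times.

    This mirrors `grep -c` with k copies of the word joined by `.*`.
    """
    return sum(1 for line in lines if line.count(word) >= k)


def total_via_thresholds(lines, word="XMAS"):
    """Add up the 'at least k' line counts for k = 1, 2, ... until one is 0."""
    total = 0
    k = 1
    while True:
        n = lines_with_at_least(lines, k, word)
        if n == 0:
            break
        total += n
        k += 1
    return total


def total_direct(lines, word="XMAS"):
    """Count every occurrence directly, line by line."""
    return sum(line.count(word) for line in lines)


print(total_via_thresholds(sample_lines), total_direct(sample_lines))
```

Both totals agree on the sample. This relies on "XMAS" not being able to overlap with itself, so non-overlapping counting misses nothing.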

    Edit: Better:

    $ grep -o XMAS words | wc -l
    198
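
For a sanity check from Python, the two kinds of count can be reproduced side by side. This is only a rough sketch: the function names and the sample string are invented for illustration, and it assumes a plain literal pattern such as XMAS.

```python
import re


def count_matching_lines(text, pattern="XMAS"):
    """Roughly what `grep -c` reports: lines with at least one match."""
    return sum(1 for line in text.splitlines() if re.search(pattern, line))


def count_all_matches(text, pattern="XMAS"):
    """Roughly what `grep -o PATTERN | wc -l` reports: every match on every line."""
    return sum(len(re.findall(pattern, line)) for line in text.splitlines())


# Hypothetical three-line sample, not the real puzzle input.
sample = "XMASXMAS\nMMMM\nSAMXMAS\n"

print(count_matching_lines(sample))  # lines containing XMAS
print(count_all_matches(sample))     # total occurrences of XMAS
print(sample.count("XMAS"))          # same total, via str.count on the raw string
```

On the sample, the line count is 2 while both occurrence counts are 3, which is the same kind of gap as 107 versus 191 in the question.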