Planet Haskell - http://planet.haskell.org/
Mark Jason Dominus: Software horror show: SAP Concur
https://blog.plover.com/prog/crap-warning-signs-2.html
<p>This complaint is a little stale, but maybe it will still be
interesting. A while back I was traveling to California on business
several times a year, and the company I worked for required that I use
<a href="https://www.concur.com/">SAP Concur</a> expense management software to
submit receipts for reimbursement.</p>
<p>At one time I would have had many, many complaints about Concur. But
today I will make only one. Here I am trying to explain to the Concur
phone app where my expense occurred; maybe it was a cab ride from the
airport or something.</p>
<p><img alt="Screenshot of a phone app with the title “Location Search”. In the input box I have typed ‘los a’. The list of results, in order, is: None; Los Andes, CHILE; Los Angeles, CHILE; Los Alcazares, SPAIN; Los Altos Hills, California; Los Alamos, New Mexico; Los Alamitos, California; Los Angeles, California; Los Altos, California; Los Alamos, California; Los Alcarrizos, DOMINICAN REPUBLIC; Los Arcos, SPAIN; Los Anauicos, VENEZUELA" class="center" src="https://pic.blog.plover.com/prog/crap-warning-signs-2/concur.png" /></p>
<p>I had to interact with this control every time there was another
expense to report, so this is part of the app's core functionality.</p>
<p>There are a lot of good choices about how to order this list.
The best ones require some work.
The app might use the phone's location feature to figure out where it is and
make an educated guess about how to order the place names. (“I'm in
California, so I'll put those first.”)
It could keep a count of how often this user has chosen each location
before, and put the most commonly chosen ones first.
It could store a
list of the locations the user has selected before and put the
previously-selected ones before the ones that had never been selected.
It could have asked, when the expense report was first created, if there
was an associated location, say “California”, and then used
that to put California places first, then United States places, then
the rest.
It could have a hardwired list of the importance of each place (or some
proxy for that, like population) and put the most important places at
the top.</p>
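<p>To make the frequency idea concrete, here is a minimal Haskell sketch. It is purely illustrative: <code>orderMatches</code> and the list of pick counts are hypothetical names, and nothing here reflects how Concur's app is actually written. Previously chosen places come first, most frequent first, and everything else falls back to alphabetical order.</p>
<pre><code>import Data.List (sortOn)
import Data.Maybe (fromMaybe)
import Data.Ord (Down (..))

-- Hypothetical ordering for a pick list.
-- pickCounts pairs a place name with how often the user chose it before.
-- Places are sorted by descending count; ties (including places that
-- were never chosen, which count as 0) are sorted alphabetically.
orderMatches :: [(String, Int)] -> [String] -> [String]
orderMatches pickCounts = sortOn key
  where
    key place = (Down (fromMaybe 0 (lookup place pickCounts)), place)
</code></pre>
<p>With an empty history this degrades to a plain alphabetical sort, which is the bare minimum argued for here.</p>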
<p>The actual authors of SAP Concur's phone app did none of these
things. I understand. Budgets are small, deadlines are tight,
product managers can be pigheaded. Sometimes the programmer doesn't
have the resources to do the best solution.</p>
<p>But this list isn't even alphabetized.</p>
<p>There are two places named Los Alamos; they are not adjacent. There
are two places in Spain; they are also not adjacent. This is
inexcusable. There is no resource constraint that is so stringent that
it would prevent the programmers from replacing</p>
<pre><code> displaySelectionList(matches)
</code></pre>
<p>with</p>
<pre><code> displaySelectionList(matches.sorted())
</code></pre>
<p>They just didn't.</p>
<p>And then whoever reviewed the code, if there was a code review, didn't
say “hey, why didn't you use <code>displaySortedSelectionList</code> here?”</p>
<p>And then the product manager didn't point at the screen and say
“wouldn't it be better to alphabetize these?”</p>
<p>And the UX person, if there was one, didn't raise any red flag, or if
they did, nothing was done.</p>
<p>I don't know what Concur's software development and release process is
like, but somehow it had a complete top-to-bottom failure of quality
control and let this shit out the door.</p>
<p>I would love to know how this happened.
<a href="https://blog.plover.com/tech/stadiometer.html">I said a while back</a>:</p>
<blockquote>
<p>Assume that bad technical decisions are made rationally, for reasons that are not apparent.</p>
</blockquote>
<p>I think this might be a useful counterexample. And if it isn't, if
the individual decision-makers all made choices that were locally
rational, it might be an instructive example of how an organization
can be so dysfunctional and so filled with perverse incentives that it
produces a stack of separately rational decisions that somehow add up
to a failure to alphabetize a pick list.</p>
<h3>Addendum: A possible explanation</h3>
<p>Dennis Felsing, a former employee of SAP working on their
<a href="https://en.wikipedia.org/wiki/SAP_HANA">HANA database</a>, has suggested how this might have
come about. Suppose that the app originally used a database that
produced the results already sorted, so that no sorting in the client
was necessary, or at least any omitted sorting wouldn't have been
noticed. Then later, the backend database was changed or upgraded to
one that didn't have the autosorting feature. (This might have
happened when Concur was acquired by SAP, if SAP insisted on
converting the app to use HANA instead of whatever it had been using.)</p>
<p>This change could have broken many similar picklists in the same way.
Perhaps there was a large and complex project to replace the database
backend, and the unsorted picklists were discovered relatively late and
were among the less severe problems that had to be overcome.
I said “there is no resource constraint that is so stringent that
it would prevent the programmers from (sorting the list)”. But if
fifty picklists broke all at the same time for the same reason? And
you weren't sure where they all were in the code?
At the tail end of a large, difficult project? It might have made
good sense to put off the minor problems like unsorted picklists for
a future development cycle. This seems quite plausible, and if it's
true, then this is <em>not</em> a counterexample to “bad technical decisions
are made rationally for reasons that are not apparent”.
(I should add, though, that the sorting issue was not fixed in the
next few years.)</p>
<p>In the earlier article I said “until I got the correct explanation,
the only explanation I could think of was unlimited incompetence.”
That happened this time also! I could not imagine a plausible
explanation, but M. Felsing provided one that was so plausible I could
imagine making the decision the same way myself. I wish I were better
at thinking of this kind of explanation.</p>
Sun, 04 Dec 2022 17:58:00 +0000
mjd@plover.com (Mark Dominus)
Monday Morning Haskell: Day 4 - Overlapping Ranges
https://mmhaskell.com/blog/2022/12/4/day-4-overlapping-ranges
<figure class=" sqs-block-image-figure intrinsic ">
<img alt="" class="thumb-image" src="https://images.squarespace-cdn.com/content/v1/584219d403596e3099e0ee9b/68bafcb2-5a23-4156-9f6e-d1d3a35d57e0/Advent+of+Haskell+4%21.jpg?format=1000w" />
</figure>
<p><a href="https://github.com/MondayMorningHaskell/AdventOfCode/blob/aoc-2022/src/Day4.hs">Solution code on GitHub</a></p>
<p><a href="https://www.mmhaskell.com/advent-of-code-2022">All 2022 Problems</a></p>
<p><a href="https://www.mmhaskell.com/subscribe">Subscribe to Monday Morning Haskell</a>!</p>
<h2 id="problem-overview">Problem Overview</h2>
<p><a href="https://adventofcode.com/2022/day/4">Full Description</a></p>
<p>For today's problem, our elf friends are dividing into pairs and cleaning sections of the campsite. Each individual elf is then assigned a range of sections of the campsite to clean. Our goal is to figure out redundant work.</p>
<p>In part 1, we want to calculate the number of pairs where one range is fully contained within the other. In part 2, we'll figure out how many pairs of ranges have <em>any</em> overlap.</p>
<h2 id="relevant-utilities">Relevant Utilities</h2>
<p>We'll be parsing a lot of numbers for this puzzle, so we'll need a handy function for that. Here's <code>parsePositiveNumber</code>:</p>
<pre><code class="lang-haskell">parsePositiveNumber :: (Monad m) => ParsecT Void Text m Int
parsePositiveNumber = read <$> some digitChar</code></pre>
<h2 id="parsing-the-input">Parsing the Input</h2>
<p>Now let's look at the sample input:</p>
<pre><code>2-4,6-8
2-3,4-5
5-7,7-9
2-8,3-7
6-6,4-6
2-6,4-8</code></pre><p>Again, we parse this line-by-line. And each line just consists of a few numbers interspersed with other characters.</p>
<pre><code class="lang-haskell">parseInput :: (MonadLogger m) => ParsecT Void Text m InputType
parseInput =
  sepEndBy1 parseLine eol

type InputType = [LineType]
type LineType = ((Int, Int), (Int, Int))

parseLine :: (MonadLogger m) => ParsecT Void Text m LineType
parseLine = do
  a1 <- parsePositiveNumber
  char '-'
  a2 <- parsePositiveNumber
  char ','
  b1 <- parsePositiveNumber
  char '-'
  b2 <- parsePositiveNumber
  return ((a1, a2), (b1, b2))</code></pre>
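<p>If we want to try the parsing logic without pulling in the parser library, a rough stand-alone sketch using only <code>base</code> might look like the following. This is not the code from the solution: <code>parseLineSimple</code> and its helpers <code>number</code> and <code>expect</code> are hypothetical names introduced for illustration, but the steps mirror <code>parseLine</code> one for one.</p>
<pre><code class="lang-haskell">import Data.Char (isDigit)

type LineType = ((Int, Int), (Int, Int))

-- Parse one line such as "2-4,6-8". Returns Nothing on malformed
-- input or trailing characters.
parseLineSimple :: String -> Maybe LineType
parseLineSimple s0 = do
  (a1, s1) <- number s0
  s2 <- expect '-' s1
  (a2, s3) <- number s2
  s4 <- expect ',' s3
  (b1, s5) <- number s4
  s6 <- expect '-' s5
  (b2, rest) <- number s6
  if null rest then Just ((a1, a2), (b1, b2)) else Nothing
  where
    -- Consume one or more leading digits and return the remaining input.
    number s = case span isDigit s of
      ("", _) -> Nothing
      (ds, rest) -> Just (read ds, rest)
    -- Consume exactly the given character.
    expect c (x : xs) | c == x = Just xs
    expect _ _ = Nothing</code></pre>
<p>Mapping this over <code>lines</code> of the input would play the role that <code>sepEndBy1 parseLine eol</code> plays above, minus the error reporting that a real parser gives us.</p>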
<h2 id="getting-the-solution">Getting the Solution</h2>
<p>In part 1, we count the number of lines where one range fully contains another. In the example above, these two lines satisfy this condition:</p>
<pre><code>2-8,3-7
6-6,4-6</code></pre><p>So we start with a function to evaluate this:</p>
<pre><code class="lang-haskell">rangeFullyContained :: ((Int, Int), (Int, Int)) -> Bool
rangeFullyContained ((a1, a2), (b1, b2)) =
  a1 <= b1 && a2 >= b2 ||
  b1 <= a1 && a2 <= b2</code></pre>
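<p>As a quick sanity check, we can run this condition over the sample input, hard-coded as a list. This is only a sketch for illustration; <code>sampleInput</code> and <code>fullyContainedPairs</code> are names made up here and are not part of the solution code.</p>
<pre><code class="lang-haskell">rangeFullyContained :: ((Int, Int), (Int, Int)) -> Bool
rangeFullyContained ((a1, a2), (b1, b2)) =
  a1 <= b1 && a2 >= b2 ||
  b1 <= a1 && a2 <= b2

-- The six sample lines, already parsed.
sampleInput :: [((Int, Int), (Int, Int))]
sampleInput =
  [ ((2, 4), (6, 8)), ((2, 3), (4, 5)), ((5, 7), (7, 9))
  , ((2, 8), (3, 7)), ((6, 6), (4, 6)), ((2, 6), (4, 8)) ]

-- Keeps only the pairs where one range contains the other:
-- [((2,8),(3,7)), ((6,6),(4,6))]
fullyContainedPairs :: [((Int, Int), (Int, Int))]
fullyContainedPairs = filter rangeFullyContained sampleInput</code></pre>
<p>These are exactly the two lines called out above, so the answer for the sample is 2.</p>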
<p>And now we use the same folding pattern that's served us for the last couple days! If the condition is satisfied, we add one to the previous score, otherwise no change.</p>
<pre><code class="lang-haskell">processInputEasy :: (MonadLogger m) => InputType -> m EasySolutionType
processInputEasy = foldM foldLine initialFoldV

type FoldType = Int

initialFoldV :: FoldType
initialFoldV = 0

foldLine :: (MonadLogger m) => FoldType -> LineType -> m FoldType
foldLine prev range = if rangeFullyContained range
  then return $ prev + 1
  else return prev</code></pre>
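<p>Since this particular fold never logs anything, the same counting step can also be written as a pure fold. The sketch below is an illustration of the pattern rather than the solution's code, and <code>countWhere</code> is a hypothetical helper name.</p>
<pre><code class="lang-haskell">import Data.List (foldl')

-- Count the elements that satisfy a predicate, using the same
-- "add one or leave the score unchanged" step as foldLine.
countWhere :: (a -> Bool) -> [a] -> Int
countWhere p = foldl' step 0
  where
    step prev x = if p x then prev + 1 else prev</code></pre>
<p>This computes the same number as <code>length . filter p</code>. The monadic version earns its keep once we want to log from inside the fold.</p>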
<h2 id="part-2">Part 2</h2>
<p>Part 2 is virtually identical, only with a different condition. In the sample input above, these are the lines with <em>any</em> overlap in the ranges:</p>
<pre><code>5-7,7-9
2-8,3-7
6-6,4-6
2-6,4-8</code></pre><p>So here's our new condition:</p>
<pre><code class="lang-haskell">rangePartiallyContained :: ((Int, Int), (Int, Int)) -> Bool
rangePartiallyContained ((a1, a2), (b1, b2)) = if a1 <= b1
  then b1 <= a2
  else a1 <= b2</code></pre>
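<p>Another common way to phrase the overlap test is "the later start is no greater than the earlier end". The sketch below compares that phrasing with the condition above on every pair of small well-formed ranges. It assumes each range has its start no greater than its end; <code>rangesOverlap</code>, <code>smallCases</code> and <code>agreeOnSmallCases</code> are hypothetical names used only for this illustration.</p>
<pre><code class="lang-haskell">-- The condition from the solution.
rangePartiallyContained :: ((Int, Int), (Int, Int)) -> Bool
rangePartiallyContained ((a1, a2), (b1, b2)) = if a1 <= b1
  then b1 <= a2
  else a1 <= b2

-- Alternative phrasing: two ranges overlap when the larger of the two
-- starts is no greater than the smaller of the two ends.
rangesOverlap :: ((Int, Int), (Int, Int)) -> Bool
rangesOverlap ((a1, a2), (b1, b2)) = max a1 b1 <= min a2 b2

-- Every pair of well-formed ranges with endpoints between 1 and 6.
smallCases :: [((Int, Int), (Int, Int))]
smallCases =
  [ ((a1, a2), (b1, b2))
  | a1 <- [1 .. 6], a2 <- [a1 .. 6], b1 <- [1 .. 6], b2 <- [b1 .. 6] ]

-- True when both phrasings give the same answer on all small cases.
agreeOnSmallCases :: Bool
agreeOnSmallCases =
  all (\r -> rangePartiallyContained r == rangesOverlap r) smallCases</code></pre>
<p>Either form works for this puzzle. The if/else version in the solution simply avoids the two comparisons that are implied once we know which range starts first.</p>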
<p>And the application of this condition is virtually identical to part 1.</p>
<pre><code class="lang-haskell">processInputHard :: (MonadLogger m) => InputType -> m HardSolutionType
processInputHard = foldM foldPart2 0

findHardSolution :: (MonadLogger m) => HardSolutionType -> m (Maybe Int)
findHardSolution _ = return Nothing

foldPart2 :: (MonadLogger m) => Int -> LineType -> m Int
foldPart2 prev range = if rangePartiallyContained range
  then return $ prev + 1
  else return prev</code></pre>
<h2 id="answering-the-question">Answering the Question</h2>
<p>Nothing has changed from our previous examples in terms of post-processing.</p>
<pre><code class="lang-haskell">solveEasy :: FilePath -> IO (Maybe Int)
solveEasy fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputEasy input

solveHard :: FilePath -> IO (Maybe Int)
solveHard fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputHard input</code></pre>
<p>And this means we're done!</p>
<h2 id="video">Video</h2>
<p><a href="https://youtu.be/9O-4qamIj9E">YouTube Link</a></p>
Sun, 04 Dec 2022 16:00:00 +0000
Monday Morning Haskell: Day 3 - Rucksacks and Badges
https://mmhaskell.com/blog/2022/12/3/day-3-rucksacks-and-badges
<figure class=" sqs-block-image-figure intrinsic ">
<img alt="" class="thumb-image" src="https://images.squarespace-cdn.com/content/v1/584219d403596e3099e0ee9b/dd310024-dd6a-43ad-9525-26a8b7e69e3e/Advent+of+Haskell+2%21+%281%29.jpg?format=1000w" />
</figure>
<p><a href="https://github.com/MondayMorningHaskell/AdventOfCode/blob/aoc-2022/src/Day3.hs">Solution code on GitHub</a></p>
<p><a href="https://www.mmhaskell.com/advent-of-code-2022">All 2022 Problems</a></p>
<p><a href="https://www.mmhaskell.com/subscribe">Subscribe to Monday Morning Haskell</a>!</p>
<h2 id="problem-overview">Problem Overview</h2>
<p><a href="https://adventofcode.com/2022/day/3">Full Description</a></p>
<p>Today's problem is essentially a deduplication problem. Each input line is a series of letters. For part 1, we're deduplicating within lines, finding one character that is in both sides of the word. For part 2, we're dividing the inputs into groups of 3, and then finding the only letter common to all three strings.</p>
<p>To "answer the question", we have to provide a "score" for each of the unique characters. The lowercase letters get the scores 1-26. Uppercase letters get the scores 27-52. Then we'll take the sum of the scores from each line or group.</p>
<h2 id="solution-approach-and-insights">Solution Approach and Insights</h2>
<p>This is quite straightforward if you know your list library functions! We'll use <code>filter</code>, <code>elem</code>, <code>chunksOf</code> and <code>nub</code>!</p>
<h2 id="parsing-the-input">Parsing the Input</h2>
<p>Here's a sample input:</p>
<pre><code>vJrwpWtwJgWrhcsFMMfFFhFp
jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL
PmmdzqPrVvPwwTWBwg
wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn
ttgJtRGJQctTZtZT
CrZsJsPPZsGzwwsLwLmpwMDw</code></pre>
<p>Nothing tricky about the parsing code, since it's all just strings with only letters!</p>
<pre><code class="lang-haskell">parseInput :: (MonadLogger m) => ParsecT Void Text m InputType
parseInput = sepEndBy1 parseLine eol

type InputType = [LineType]
type LineType = String

parseLine :: (MonadLogger m) => ParsecT Void Text m LineType
parseLine = some letterChar</code></pre>
<h2 id="getting-the-solution">Getting the Solution</h2>
<p>We'll start with our scoring function. Of course, we'll use the <code>ord</code> function to turn each character into its ASCII number. Then we have to subtract the right amount so that lowercase 'a' (ASCII 97) gets a score of 1 and uppercase 'A' (ASCII 65) gets a score of 27:</p>
<pre><code class="lang-haskell">scoreChar :: Char -> Int
scoreChar c = if isUpper c
  then ord c - 38
  else ord c - 96</code></pre>
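<p>The constants 38 and 96 fall out of the arithmetic: 97 - 1 = 96 for lowercase and 65 - 27 = 38 for uppercase. As an illustration only, here is a variant that spells this out relative to <code>'a'</code> and <code>'A'</code>; <code>scoreChar'</code> is a hypothetical name and not part of the solution.</p>
<pre><code class="lang-haskell">import Data.Char (isUpper, ord)

-- The scoring function from the solution.
scoreChar :: Char -> Int
scoreChar c = if isUpper c
  then ord c - 38
  else ord c - 96

-- The same scores, written as an offset from the first letter of each
-- case: 'a' maps to 1 and 'A' maps to 27.
scoreChar' :: Char -> Int
scoreChar' c
  | isUpper c = ord c - ord 'A' + 27
  | otherwise = ord c - ord 'a' + 1</code></pre>
<p>Both versions assume the input is an ASCII letter, which the parser guarantees for our puzzle input.</p>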
<p>The rest of the solution involves the same folding pattern from Day 2. As a reminder, here's the setup code (I'll omit this in future examples):</p>
<pre><code class="lang-haskell">solveFold :: (MonadLogger m) => [LineType] -> m EasySolutionType
solveFold = foldM foldLine initialFoldV
type FoldType = Int
initialFoldV :: FoldType
initialFoldV = 0
foldLine :: (MonadLogger m) => FoldType -> LineType -> m FoldType
foldLine = ...</code></pre>
<p>So the only challenge is filling out the folding function. First, we divide our word into the first half and the second half.</p>
<pre><code class="lang-haskell">foldLine :: (MonadLogger m) => FoldType -> LineType -> m FoldType
foldLine prevScore inputLine = ...
  where
    compartmentSize = length inputLine `quot` 2
    (firstHalf, secondHalf) = splitAt compartmentSize inputLine</code></pre>
<p>Then we find the only character in both halves by filtering the first half based on being an <code>elem</code> of the second half. We also use <code>nub</code> to get rid of duplicates. We break this up with a case statement. If there's only one (as we expect), then we'll take its score and add it to the previous score. Otherwise we'll log an error message and return the previous score.</p>
<pre><code class="lang-haskell">foldLine :: (MonadLogger m) => FoldType -> LineType -> m FoldType
foldLine prevScore inputLine = do
  case charsInBoth of
    [c] -> return (prevScore + scoreChar c)
    cs -> logErrorN ("Invalid chars in both sides! " <> (pack . show $ cs)) >> return prevScore
  where
    compartmentSize = length inputLine `quot` 2
    (firstHalf, secondHalf) = splitAt compartmentSize inputLine
    charsInBoth = nub $ filter (`elem` secondHalf) firstHalf</code></pre>
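<p>To see the whole of part 1 in one place without the logging machinery, here is a pure sketch using only <code>base</code>. It is an illustration rather than the solution's code: <code>commonInHalves</code> and <code>part1</code> are hypothetical names, <code>intersect</code> stands in for the <code>filter</code>/<code>elem</code> combination, and lines that don't have exactly one shared character are silently skipped instead of logged.</p>
<pre><code class="lang-haskell">import Data.Char (isUpper, ord)
import Data.List (intersect, nub)

scoreChar :: Char -> Int
scoreChar c = if isUpper c
  then ord c - 38
  else ord c - 96

-- Distinct characters that appear in both halves of a line.
commonInHalves :: String -> [Char]
commonInHalves line = nub (firstHalf `intersect` secondHalf)
  where
    (firstHalf, secondHalf) = splitAt (length line `quot` 2) line

-- Sum the scores of the lines that share exactly one character.
part1 :: [String] -> Int
part1 ls = sum [scoreChar c | [c] <- map commonInHalves ls]

sampleLines :: [String]
sampleLines =
  [ "vJrwpWtwJgWrhcsFMMfFFhFp"
  , "jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL"
  , "PmmdzqPrVvPwwTWBwg"
  , "wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn"
  , "ttgJtRGJQctTZtZT"
  , "CrZsJsPPZsGzwwsLwLmpwMDw" ]</code></pre>
<p>On the sample input the shared characters are p, L, P, v, t and s, so <code>part1 sampleLines</code> comes to 16 + 38 + 42 + 22 + 20 + 19 = 157.</p>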
<p>And that's all for part 1!</p>
<h2 id="part-2">Part 2</h2>
<p>For part 2, we want to divide the input lines into groups of 3, and then find the common letter among them. Once again, we use a fold that starts with <code>chunksOf</code> to divide our input into groups of 3.</p>
<pre><code class="lang-haskell">processInputHard :: (MonadLogger m) => InputType -> m HardSolutionType
processInputHard allLines = foldM foldHard 0 (chunksOf 3 allLines)
foldHard :: (MonadLogger m) => Int -> [String] -> m Int
foldHard = ...</code></pre>
<p>With this function, we first make sure we have exactly 3 strings.</p>
<pre><code class="lang-haskell">foldHard :: (MonadLogger m) => Int -> [String] -> m Int
foldHard prevScore [s1, s2, s3] = ...
foldHard prevScore inputs = logErrorN ("Invalid inputs (should be size 3) " <> (pack . show $ inputs)) >> return prevScore</code></pre>
<p>Now for the primary case, we do the same thing as before, only we filter <code>s1</code> based on <code>s2</code>. Then we filter that result with <code>s3</code> and do the same <code>nub</code> trick.</p>
<pre><code class="lang-haskell">foldHard :: (MonadLogger m) => Int -> [String] -> m Int
foldHard prevScore [s1, s2, s3] = ...
  where
    s1AndS2 = filter (`elem` s2) s1
    all3 = nub $ filter (`elem` s3) s1AndS2</code></pre>
<p>And we conclude with the same process as before. Log an error if we don't get the right outputs, otherwise add the score for the character.</p>
<pre><code class="lang-haskell">foldHard :: (MonadLogger m) => Int -> [String] -> m Int
foldHard prevScore [s1, s2, s3] = do
  case all3 of
    [c] -> logErrorN ("Found " <> (pack [c]) <> " with score " <> (pack . show $ scoreChar c)) >> return (prevScore + scoreChar c)
    cs -> logErrorN ("Invalid chars in all 3 ! " <> (pack . show $ cs)) >> return prevScore
  where
    s1AndS2 = filter (`elem` s2) s1
    all3 = nub $ filter (`elem` s3) s1AndS2</code></pre>
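<p>The same idea can be sketched for part 2 as a pure function using only <code>base</code>. Again this is an illustration and not the solution's code: <code>chunksOf'</code> is a hand-rolled stand-in for the library <code>chunksOf</code>, <code>commonToAll</code> and <code>part2</code> are hypothetical names, and groups that aren't exactly three lines or don't share exactly one character are skipped rather than logged.</p>
<pre><code class="lang-haskell">import Data.Char (isUpper, ord)
import Data.List (intersect, nub)

scoreChar :: Char -> Int
scoreChar c = if isUpper c
  then ord c - 38
  else ord c - 96

-- Split a list into consecutive groups of n elements; the last group
-- may be shorter.
chunksOf' :: Int -> [a] -> [[a]]
chunksOf' n xs
  | n <= 0 || null xs = []
  | otherwise = let (chunk, rest) = splitAt n xs in chunk : chunksOf' n rest

-- Distinct characters present in every string of the group.
commonToAll :: [String] -> [Char]
commonToAll [] = []
commonToAll groups = nub (foldr1 intersect groups)

-- Sum the badge scores over complete groups of three lines.
part2 :: [String] -> Int
part2 ls =
  sum [scoreChar c | [c] <- [commonToAll g | g@[_, _, _] <- chunksOf' 3 ls]]

sampleLines :: [String]
sampleLines =
  [ "vJrwpWtwJgWrhcsFMMfFFhFp"
  , "jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL"
  , "PmmdzqPrVvPwwTWBwg"
  , "wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn"
  , "ttgJtRGJQctTZtZT"
  , "CrZsJsPPZsGzwwsLwLmpwMDw" ]</code></pre>
<p>For the sample input the two groups share r and Z respectively, so <code>part2 sampleLines</code> comes to 18 + 52 = 70.</p>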
<h2 id="answering-the-question">Answering the Question</h2>
<p>As with the past couple days, we don't have any more work to do after processing the input:</p>
<pre><code class="lang-haskell">solveEasy :: FilePath -> IO (Maybe Int)
solveEasy fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputEasy input

solveHard :: FilePath -> IO (Maybe Int)
solveHard fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputHard input</code></pre>
<p>And this gives us our answer!</p>
<h2 id="video">Video</h2>
<p><a href="https://youtu.be/f4vzL9n_zhs">YouTube Link</a></p>
Sat, 03 Dec 2022 16:00:00 +0000
<div class="sqs-block-content" morss_own_score="5.778006166495375" morss_score="113.27800616649537"><p><a href="https://github.com/MondayMorningHaskell/AdventOfCode/blob/aoc-2022/src/Day3.hs">Solution code on GitHub</a></p>
<p><a href="https://www.mmhaskell.com/advent-of-code-2022">All 2022 Problems</a></p>
<p><a href="https://www.mmhaskell.com/subscribe">Subscribe to Monday Morning Haskell</a>!</p>
<h2>Problem Overview</h2>
<p><a href="https://adventofcode.com/2022/day/3">Full Description</a></p>
<p>Today's problem is essentially a deduplication problem. Each input line is a series of letters. For part 1, we're deduplicating within lines, finding one character that is in both sides of the word. For part 2, we're dividing the inputs into groups of 3, and then finding the only letter common to all three strings.</p>
<p>To "answer the question", we have to provide a "score" for each of the unique characters. The lowercase letters get the scores 1-26. Uppercase letters get the scores 27-52. Then we'll take the sum of the scores from each line or group.</p>
<h2>Solution Approach and Insights</h2>
<p>This is quite straightforward if you know your list library functions! We'll use <code>filter</code>, <code>elem</code>, <code>chunksOf</code> and <code>nub</code>!</p>
<h2>Parsing the Input</h2>
<p>Here's a sample input:</p>
<pre><code>vJrwpWtwJgWrhcsFMMfFFhFp
jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL
PmmdzqPrVvPwwTWBwg
wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn
ttgJtRGJQctTZtZT
CrZsJsPPZsGzwwsLwLmpwMDw</code></pre>
<p>Nothing tricky about the parsing code, since it's all just strings with only letters!</p>
<pre><code>parseInput :: (MonadLogger m) => ParsecT Void Text m InputType
parseInput = sepEndBy1 parseLine eol

type InputType = [LineType]
type LineType = String

parseLine :: (MonadLogger m) => ParsecT Void Text m LineType
parseLine = some letterChar</code></pre>
<h2>Getting the Solution</h2>
<p>We'll start with our scoring function. Of course, we'll use the <code>ord</code> function to turn each character into its ASCII number. Then we have to subtract the right amount so that lowercase 'a' (ASCII 97) gets a score of 1 and uppercase 'A' (ASCII 65) gets a score of 27:</p>
<pre><code>scoreChar :: Char -> Int
scoreChar c = if isUpper c
  then ord c - 38
  else ord c - 96</code></pre>
<p>The rest of the solution involves the same folding pattern from Day 2. As a reminder, here's the setup code (I'll omit this in future examples):</p>
<pre><code>solveFold :: (MonadLogger m) => [LineType] -> m EasySolutionType
solveFold = foldM foldLine initialFoldV
type FoldType = Int
initialFoldV :: FoldType
initialFoldV = 0
foldLine :: (MonadLogger m) => FoldType -> LineType -> m FoldType
foldLine = ...</code></pre>
<p>So the only challenge is filling out the folding function. First, we divide our word into the first half and the second half.</p>
<pre><code>foldLine :: (MonadLogger m) => FoldType -> LineType -> m FoldType
foldLine prevScore inputLine = ...
  where
    compartmentSize = length inputLine `quot` 2
    (firstHalf, secondHalf) = splitAt compartmentSize inputLine</code></pre>
<p>Then we find the only character in both halves by filtering the first half based on being an <code>elem</code> of the second half. We also use <code>nub</code> to get rid of duplicates. We break this up with a case statement. If there's only one (as we expect), then we'll take its score and add it to the previous score. Otherwise we'll log an error message and return the previous score.</p>
<pre><code>foldLine :: (MonadLogger m) => FoldType -> LineType -> m FoldType
foldLine prevScore inputLine = do
  case charsInBoth of
    [c] -> return (prevScore + scoreChar c)
    cs -> logErrorN ("Invalid chars in both sides! " <> (pack . show $ cs)) >> return prevScore
  where
    compartmentSize = length inputLine `quot` 2
    (firstHalf, secondHalf) = splitAt compartmentSize inputLine
    charsInBoth = nub $ filter (`elem` secondHalf) firstHalf</code></pre>
<p>And that's all for part 1!</p>
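<p>If you want to test the logic without the logging monad, a pure variant is easy to sketch. This is only an illustration under a few assumptions: the names <code>scoreLine</code> and <code>solvePure</code> are hypothetical, <code>intersect</code> stands in for the <code>filter</code>/<code>elem</code> combination, and a malformed line makes the whole result <code>Nothing</code> instead of being logged and skipped.</p>

```haskell
import Data.Char (isUpper, ord)
import Data.List (intersect, nub)

scoreChar :: Char -> Int
scoreChar c = if isUpper c then ord c - 38 else ord c - 96

-- Score one line, or Nothing when the halves do not share exactly one letter.
scoreLine :: String -> Maybe Int
scoreLine line = case nub (firstHalf `intersect` secondHalf) of
  [c] -> Just (scoreChar c)
  _   -> Nothing
  where
    (firstHalf, secondHalf) = splitAt (length line `quot` 2) line

-- Sum the scores of all lines, failing if any single line is malformed.
solvePure :: [String] -> Maybe Int
solvePure = fmap sum . traverse scoreLine
```

<p>On the last two sample lines shown earlier, <code>solvePure ["ttgJtRGJQctTZtZT", "CrZsJsPPZsGzwwsLwLmpwMDw"]</code> gives <code>Just 39</code>: the shared letters are 't' (20) and 's' (19).</p>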
<h2>Part 2</h2>
<p>For part 2, we want to divide the input lines into groups of 3, and then find the common letter among them. Once again we use a fold, this time over the groups of 3 that <code>chunksOf</code> produces from our input.</p>
<pre><code>processInputHard :: (MonadLogger m) => InputType -> m HardSolutionType
processInputHard allLines = foldM foldHard 0 (chunksOf 3 allLines)
foldHard :: (MonadLogger m) => Int -> [String] -> m Int
foldHard = ...</code></pre>
<p>With this function, we first make sure we have exactly 3 strings.</p>
<pre><code>foldHard :: (MonadLogger m) => Int -> [String] -> m Int
foldHard prevScore [s1, s2, s3] = ...
foldHard prevScore inputs = logErrorN ("Invalid inputs (should be size 3) " <> (pack . show $ inputs)) >> return prevScore</code></pre>
<p>Now for the primary case, we do the same thing as before, only we filter <code>s1</code> based on <code>s2</code>. Then we filter that result with <code>s3</code> and do the same <code>nub</code> trick.</p>
<pre><code>foldHard :: (MonadLogger m) => Int -> [String] -> m Int
foldHard prevScore [s1, s2, s3] = ...
  where
    s1AndS2 = filter (`elem` s2) s1
    all3 = nub $ filter (`elem` s3) s1AndS2</code></pre>
<p>And we conclude with the same process as before. Log an error if we don't get the right outputs, otherwise add the score for the character.</p>
<pre><code>foldHard :: (MonadLogger m) => Int -> [String] -> m Int
foldHard prevScore [s1, s2, s3] = do
  case all3 of
    [c] -> logDebugN ("Found " <> (pack [c]) <> " with score " <> (pack . show $ scoreChar c)) >> return (prevScore + scoreChar c)
    cs -> logErrorN ("Invalid chars in all 3! " <> (pack . show $ cs)) >> return prevScore
  where
    s1AndS2 = filter (`elem` s2) s1
    all3 = nub $ filter (`elem` s3) s1AndS2</code></pre>
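<p>The same idea can be sketched as a pure function for quick experiments. The names here are hypothetical: <code>groupsOf</code> is a local stand-in for <code>chunksOf</code> so the snippet needs only <code>base</code>, and <code>scoreGroup</code> returns <code>Nothing</code> where the version above logs an error and keeps the previous score.</p>

```haskell
import Data.Char (isUpper, ord)
import Data.List (intersect, nub)

scoreChar :: Char -> Int
scoreChar c = if isUpper c then ord c - 38 else ord c - 96

-- Local stand-in for chunksOf: split a list into consecutive groups of n.
groupsOf :: Int -> [a] -> [[a]]
groupsOf n xs
  | n <= 0 || null xs = []
  | otherwise = let (g, rest) = splitAt n xs in g : groupsOf n rest

-- A group must hold exactly 3 lines that share exactly one distinct letter.
scoreGroup :: [String] -> Maybe Int
scoreGroup grp@[_, _, _] = case nub (foldr1 intersect grp) of
  [c] -> Just (scoreChar c)
  _   -> Nothing
scoreGroup _ = Nothing

-- Sum the scores of all groups, failing if any group is malformed.
solveHardPure :: [String] -> Maybe Int
solveHardPure = fmap sum . traverse scoreGroup . groupsOf 3
```

<p>With made-up input, <code>solveHardPure ["abc", "cde", "xcz"]</code> gives <code>Just 3</code>, since 'c' is the only letter in all three strings.</p>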
<h2>Answering the Question</h2>
<p>As with the past couple of days, we don't have any more work to do after processing the input:</p>
<pre><code>solveEasy :: FilePath -> IO (Maybe Int)
solveEasy fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputEasy input
solveHard :: FilePath -> IO (Maybe Int)
solveHard fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputHard input</code></pre>
<p>And this gives us our answer!</p>
<h2>Video</h2>
<p><a href="https://youtu.be/f4vzL9n_zhs">YouTube Link</a></p>
</div>Brent Yorgey: Competitive programming in Haskell: Infinite 2D array, Level 1http://byorgey.wordpress.com/?p=2470https://byorgey.wordpress.com/2022/12/03/competitive-programming-in-haskell-infinite-2d-array-level-1/<p>In my <a href="https://byorgey.wordpress.com/2022/09/01/competitive-programming-in-haskell-infinite-2d-array/">previous post</a>, I challenged you to solve <a href="https://open.kattis.com/problems/infinite2darray">Infinite 2D Array</a> using Haskell. As a reminder, the problem specifies a two-parameter recurrence <img alt="F_{x,y}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7Bx%2Cy%7D&bg=ffffff&fg=333333&s=0&c=20201002" />, given by</p>
<ul>
<li><img alt="F_{0,0} = 0" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7B0%2C0%7D+%3D+0&bg=ffffff&fg=333333&s=0&c=20201002" /></li>
<li><img alt="F_{0,1} = F_{1,0} = 1" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7B0%2C1%7D+%3D+F_%7B1%2C0%7D+%3D+1&bg=ffffff&fg=333333&s=0&c=20201002" /></li>
<li><img alt="F_{i,0} = F_{i-1,0} + F_{i-2,0}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7Bi%2C0%7D+%3D+F_%7Bi-1%2C0%7D+%2B+F_%7Bi-2%2C0%7D&bg=ffffff&fg=333333&s=0&c=20201002" /> for <img alt="i \geq 2" class="latex" src="https://s0.wp.com/latex.php?latex=i+%5Cgeq+2&bg=ffffff&fg=333333&s=0&c=20201002" /></li>
<li><img alt="F_{0,i} = F_{0,i-1} + F_{0,i-2}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7B0%2Ci%7D+%3D+F_%7B0%2Ci-1%7D+%2B+F_%7B0%2Ci-2%7D&bg=ffffff&fg=333333&s=0&c=20201002" /> for <img alt="i \geq 2" class="latex" src="https://s0.wp.com/latex.php?latex=i+%5Cgeq+2&bg=ffffff&fg=333333&s=0&c=20201002" /></li>
<li><img alt="F_{i,j} = F_{i-1,j} + F_{i,j-1}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7Bi%2Cj%7D+%3D+F_%7Bi-1%2Cj%7D+%2B+F_%7Bi%2Cj-1%7D&bg=ffffff&fg=333333&s=0&c=20201002" /> for <img alt="i,j \geq 1" class="latex" src="https://s0.wp.com/latex.php?latex=i%2Cj+%5Cgeq+1&bg=ffffff&fg=333333&s=0&c=20201002" />.</li>
</ul>
<p>We are given particular values of <img alt="x" class="latex" src="https://s0.wp.com/latex.php?latex=x&bg=ffffff&fg=333333&s=0&c=20201002" /> and <img alt="y" class="latex" src="https://s0.wp.com/latex.php?latex=y&bg=ffffff&fg=333333&s=0&c=20201002" />, and asked to compute <img alt="F_{x,y} \bmod (10^9 + 7)" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7Bx%2Cy%7D+%5Cbmod+%2810%5E9+%2B+7%29&bg=ffffff&fg=333333&s=0&c=20201002" />. The problem is that <img alt="x" class="latex" src="https://s0.wp.com/latex.php?latex=x&bg=ffffff&fg=333333&s=0&c=20201002" /> and <img alt="y" class="latex" src="https://s0.wp.com/latex.php?latex=y&bg=ffffff&fg=333333&s=0&c=20201002" /> could be as large as <img alt="10^6" class="latex" src="https://s0.wp.com/latex.php?latex=10%5E6&bg=ffffff&fg=333333&s=0&c=20201002" />, so simply computing the entire <img alt="x \times y" class="latex" src="https://s0.wp.com/latex.php?latex=x+%5Ctimes+y&bg=ffffff&fg=333333&s=0&c=20201002" /> array is completely out of the question: it would take almost 4 <em>terabytes</em> of memory to store a <img alt="10^6 \times 10^6" class="latex" src="https://s0.wp.com/latex.php?latex=10%5E6+%5Ctimes+10%5E6&bg=ffffff&fg=333333&s=0&c=20201002" /> array of 32-bit integer values. In this post, I’ll answer the Level 1 challenge: coming up with a general formula for <img alt="F_{x,y}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7Bx%2Cy%7D&bg=ffffff&fg=333333&s=0&c=20201002" />.</p>
<p>We need to be more clever about computing a given <img alt="F_{x,y}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7Bx%2Cy%7D&bg=ffffff&fg=333333&s=0&c=20201002" /> without computing every entry in the entire 2D array, so we look for some patterns. It’s pretty obvious that the array has Fibonacci numbers along both the top two rows and the first two columns, though it’s sadly just as obvious that we don’t get Fibonacci numbers anywhere else. The last rule, the rule that determines the interior entries, says that each interior cell is the sum of the cell above it and the cell to the left. This looks a lot like the rule for generating Pascal’s triangle, <em>i.e.</em> binomial coefficients; in fact, if the first row and column were specified to be all 1’s instead of Fibonacci numbers, then we would get exactly binomial coefficients.</p>
<p>I knew that binomial coefficients can also be thought of as counting <a href="http://discrete.openmathbooks.org/dmoi2/sec_counting-binom.html">the number of paths from one point in a grid to another which can only take east or south steps</a>, and this finally gave me the right insight. Each interior cell is a sum of other cells, which are themselves sums of other cells, and so on until we get to the edges, and so ultimately each interior cell can be thought of as a sum of a bunch of copies of numbers on the edges, <em>i.e.</em> Fibonacci numbers. How many copies? Well, the number of times each Fibonacci number on an edge contributes to a particular interior cell is equal to the number of paths from the Fibonacci number to the interior cell (with the restriction that the paths’ first step must immediately be into the interior of the grid, instead of taking a step along the first row or column). For example, consider <img alt="F_{3,2} = 11" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7B3%2C2%7D+%3D+11&bg=ffffff&fg=333333&s=0&c=20201002" />. The two 1’s along the top row contribute 3 times and 1 time, respectively, whereas the 1’s and 2 along the first column contribute 3 times, 2 times, and once, respectively, for a total of <img alt="11" class="latex" src="https://s0.wp.com/latex.php?latex=11&bg=ffffff&fg=333333&s=0&c=20201002" />:</p>
<div style="text-align: center;">
<p><img src="https://byorgey.files.wordpress.com/2022/12/331a85cd2f470b8c.png?w=640" /></p>
</div>
<p>The number of paths from <img alt="F_{0,k}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7B0%2Ck%7D&bg=ffffff&fg=333333&s=0&c=20201002" /> to <img alt="F_{x,y}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7Bx%2Cy%7D&bg=ffffff&fg=333333&s=0&c=20201002" /> is the number of grid paths from <img alt="(1,k)" class="latex" src="https://s0.wp.com/latex.php?latex=%281%2Ck%29&bg=ffffff&fg=333333&s=0&c=20201002" /> to <img alt="(x,y)" class="latex" src="https://s0.wp.com/latex.php?latex=%28x%2Cy%29&bg=ffffff&fg=333333&s=0&c=20201002" />, which is <img alt="\binom{(x-1) + (y-k)}{y-k}" class="latex" src="https://s0.wp.com/latex.php?latex=%5Cbinom%7B%28x-1%29+%2B+%28y-k%29%7D%7By-k%7D&bg=ffffff&fg=333333&s=0&c=20201002" />. Likewise the number of paths from <img alt="F_{k,0}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7Bk%2C0%7D&bg=ffffff&fg=333333&s=0&c=20201002" /> to <img alt="F_{x,y}" class="latex" src="https://s0.wp.com/latex.php?latex=F_%7Bx%2Cy%7D&bg=ffffff&fg=333333&s=0&c=20201002" /> is <img alt="\binom{(x-k) + (y-1)}{x-k}" class="latex" src="https://s0.wp.com/latex.php?latex=%5Cbinom%7B%28x-k%29+%2B+%28y-1%29%7D%7Bx-k%7D&bg=ffffff&fg=333333&s=0&c=20201002" />. All together, this yields the formula</p>
<p><img alt="\displaystyle F_{x,y} = \left(\sum_{1 \leq k \leq x} F_k \binom{x-k+y-1}{x-k}\right) + \left(\sum_{1 \leq k \leq y} F_k \binom{y-k+x-1}{y-k}\right) \pmod{P}" class="latex" src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+F_%7Bx%2Cy%7D+%3D+%5Cleft%28%5Csum_%7B1+%5Cleq+k+%5Cleq+x%7D+F_k+%5Cbinom%7Bx-k%2By-1%7D%7Bx-k%7D%5Cright%29+%2B+%5Cleft%28%5Csum_%7B1+%5Cleq+k+%5Cleq+y%7D+F_k+%5Cbinom%7By-k%2Bx-1%7D%7By-k%7D%5Cright%29+%5Cpmod%7BP%7D&bg=ffffff&fg=333333&s=0&c=20201002" /></p>
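<p>As a check, here is the earlier example worked out term by term with this formula, taking x = 3, y = 2 and the Fibonacci values F_1 = F_2 = 1, F_3 = 2. The coefficients 3, 2, 1 and 3, 1 are the path counts quoted above.</p>

```latex
\begin{aligned}
F_{3,2} &= \left( F_1 \binom{3}{2} + F_2 \binom{2}{1} + F_3 \binom{1}{0} \right)
         + \left( F_1 \binom{3}{1} + F_2 \binom{2}{0} \right) \\
        &= (1 \cdot 3 + 1 \cdot 2 + 2 \cdot 1) + (1 \cdot 3 + 1 \cdot 1) \\
        &= 7 + 4 = 11
\end{aligned}
```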
<p>Commenter <a href="https://byorgey.wordpress.com/2022/09/01/competitive-programming-in-haskell-infinite-2d-array/#comment-40784">Soumik Sarkar found a different formula</a>,</p>
<p><img alt="\displaystyle F_{x,y} = F_{x+2y} + \sum_{1 \leq k \leq y} (F_k - F_{2k}) \binom{y-k+x-1}{y-k} \pmod{P}" class="latex" src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+F_%7Bx%2Cy%7D+%3D+F_%7Bx%2B2y%7D+%2B+%5Csum_%7B1+%5Cleq+k+%5Cleq+y%7D+%28F_k+-+F_%7B2k%7D%29+%5Cbinom%7By-k%2Bx-1%7D%7By-k%7D+%5Cpmod%7BP%7D&bg=ffffff&fg=333333&s=0&c=20201002" /></p>
<p>which clearly has some similarity to mine, but I have not been able to figure out how to derive it, and Soumik did not explain how they found it. Any insights welcome!</p>
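<p>Both formulas can at least be sanity-checked against the recurrence for small inputs. The following is a minimal sketch, not the efficient solution this series is building toward: it uses plain <code>Integer</code> arithmetic with no modulus, only handles interior cells (x, y at least 1), and the names <code>fib</code>, <code>choose</code>, <code>naive</code>, <code>pathFormula</code>, <code>sarkarFormula</code> and <code>agree</code> are made up for this illustration.</p>

```haskell
-- Fibonacci numbers F_0 = 0, F_1 = 1, F_2 = 1, ...
fibs :: [Integer]
fibs = 0 : scanl (+) 1 fibs

fib :: Int -> Integer
fib n = fibs !! n

-- Binomial coefficient via the multiplicative formula (0 outside 0 <= k <= n).
choose :: Int -> Int -> Integer
choose n k
  | k < 0 || k > n = 0
  | otherwise =
      product [fromIntegral (n - k + 1) .. fromIntegral n]
        `div` product [1 .. fromIntegral k]

-- Reference values straight from the recurrence (exponential time; small inputs only).
naive :: Int -> Int -> Integer
naive x 0 = fib x
naive 0 y = fib y
naive x y = naive (x - 1) y + naive x (y - 1)

-- The path-counting formula from the post.
pathFormula :: Int -> Int -> Integer
pathFormula x y =
  sum [fib k * choose (x - k + y - 1) (x - k) | k <- [1 .. x]]
    + sum [fib k * choose (y - k + x - 1) (y - k) | k <- [1 .. y]]

-- Soumik Sarkar's formula as quoted in the post.
sarkarFormula :: Int -> Int -> Integer
sarkarFormula x y =
  fib (x + 2 * y)
    + sum [(fib k - fib (2 * k)) * choose (y - k + x - 1) (y - k) | k <- [1 .. y]]

-- Do all three agree on every interior cell with 1 <= x, y <= n?
agree :: Int -> Bool
agree n =
  and
    [ naive x y == pathFormula x y && naive x y == sarkarFormula x y
    | x <- [1 .. n], y <- [1 .. n]
    ]
```

<p>For instance, <code>naive 3 2</code>, <code>pathFormula 3 2</code> and <code>sarkarFormula 3 2</code> all evaluate to 11, matching the example above.</p>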
<p>In any case, both of these formulas involve a sum of only <img alt="O(x+y)" class="latex" src="https://s0.wp.com/latex.php?latex=O%28x%2By%29&bg=ffffff&fg=333333&s=0&c=20201002" /> terms, instead of <img alt="O(xy)" class="latex" src="https://s0.wp.com/latex.php?latex=O%28xy%29&bg=ffffff&fg=333333&s=0&c=20201002" />, although the individual terms are going to be much more work to compute. The question now becomes how to efficiently compute Fibonacci numbers and binomial coefficients modulo a prime. I’ll talk about that in the next post!</p>Sat, 03 Dec 2022 11:43:08 +0000<div id="post-2470" class="post-2470 post type-post status-publish format-standard hentry category-competitive-programming category-haskell tag-haskell tag-kattis tag-number" morss_own_score="4.846846846846847" morss_score="9.1640958762423">
<h2><a href="https://byorgey.wordpress.com/2022/12/03/competitive-programming-in-haskell-infinite-2d-array-level-1/">Competitive programming in Haskell: Infinite 2D array, Level 1</a></h2>
<a href="https://byorgey.wordpress.com/2022/12/03/competitive-programming-in-haskell-infinite-2d-array-level-1/" title="7:43 am"><span>December 3, 2022</span></a> <span>by</span> <a href="https://byorgey.wordpress.com/author/byorgey/" title="View all posts by Brent">Brent</a>
<div class="entry-content" morss_own_score="5.3011647254575704" morss_score="30.245890658207458">
</div>
<img src="https://0.gravatar.com/avatar/cc113924265dbeb535c8b2fefe4e33ee?s=60&d=identicon&r=G">
<div>
<h2>
About Brent </h2>
Associate Professor of Computer Science at Hendrix College. Functional programmer, mathematician, teacher, pianist, follower of Jesus.
<a href="https://byorgey.wordpress.com/author/byorgey/">
View all posts by Brent <span>→</span> </a>
</div>
</div>