And then, there was Day 11. We are supposed to look for a stabilized layout of seat occupation when all of the people follow certain rules, depending on how many seats around them are already occupied.

This is Conway’s Game of Life with a little addition: Tiles where no seat exists.

Since I wanted to try out the Oracle MODEL clause for a while, I thought it couldn’t hurt to do this challenge with the special toolkit it provides. And it even worked pretty well, using the small input of the example.

Another benefit was that I could use Kim Berg Hansens awesome book “Practical Oracle SQL” which covers exactly that Game of Life example with the MODEL clause.

In hope to improve the performance overhead, I started with normalizing the data to get a vertical set of row_id, col_id data:

create table aoc_day11_normalized as
with
  base_data as (
    select
      rownum row_id,
      column_value line
    from table (
        aoc_file_loader.file_as_stringlist(
          aoc_file_loader.local_url('day_11_input.txt')
       )
    )
  ),
  cols as (
    select
      level col_id
    from dual
    connect by level <= (select length(line) from base_data where row_id = 1)
  ),
  column_data as (
    select
      base_data.row_id,
      cols.col_id,
      substr(line, cols.col_id, 1) seat
    from base_data
      cross join cols
  )
select * from column_data;

With that, I tried the approach with MODEL, first even without caring for change detection (we want to stop iterating once no changes occur):

with
  column_data as (
    select
      row_id,
      col_id,
      seat
    from aoc_day11_normalized
  ),
  modeled_data as (
    select
      *
    from column_data
    model
    dimension by (
      0 generation,
      row_id,
      col_id
    )
    measures (
      seat,
      0 as sum_occupied,
      0 as nb_occupied
    )
    ignore nav
    rules upsert all iterate(1) (
      sum_occupied[iteration_number, any, any] =
        sum(
          case seat
            when 'L' then 0
            when '#' then 1
            else null
          end
        )[
          generation = iteration_number,
          row_id between cv()-1 and cv()+1,
          col_id between cv()-1 and cv()+1
        ],
      nb_occupied[iteration_number, any, any] =
        sum_occupied[iteration_number, cv(), cv()]
        - case seat[iteration_number, cv(), cv()] when '#' then 1 else 0 end
      ,
      seat[iteration_number+1, any, any] =
        case
          when seat[iteration_number, cv(), cv()] = '.' then
            '.'
          when seat[iteration_number, cv(), cv()] = 'L'
            and nb_occupied[iteration_number, cv(), cv()] = 0 then
            '#'
          when seat[iteration_number, cv(), cv()] = '#'
            and nb_occupied[iteration_number, cv(), cv()] >= 4 then
            'L'
          else seat[iteration_number, cv(), cv()]
        end
    )
  ),
  display as (
    select
      generation,
      row_id,
      listagg(seat) within group(order by col_id) seats,
      sum(case seat when '#' then 1 else 0 end) occupied_seats
    from modeled_data
    group by generation, rollup(row_id)
  )
select
  generation, row_id, col_id, seat
from modeled_data

That worked pretty well with the tiny input-data of the example – but one iteration alone on my real input (92×92 cells) took half a minute – and it got worse the more iterations I added.

After putting some serious but fruitless effort into that solution, I did what a knowledge worker does if they hit a problem they cannot solve: I asked for help and contacted Kim as an expert who even wrote about that MODEL clause.

As he is an incredibly nice guy, he answered immediately and even came up with a blog post, pointing out that aggregate functions seem to significantly hurt the performance.

So my final solution to Part 1 runs now in about 2 minutes, needing 96 iterations:

with
  column_data as (
    select
      row_id,
      col_id,
      seat,
      case seat
        when '#' then 1
        else 0
      end occupied
    from aoc_day11_normalized
    --where row_id <= 10
  ),
  modeled_data as (
    select
      *
    from column_data
    model
    dimension by (
      0 generation,
      row_id,
      col_id
    )
    measures (
      seat,
      occupied,
      0 as sum_occupied,
      0 as nb_occupied,
      0 as changes,
      1 as sum_changes
    )
    ignore nav
    rules upsert all iterate(200) until (iteration_number > 1 and sum_changes[iteration_number, 1, 1] <= 0) (
      occupied[iteration_number, any, any] =
        case seat[iteration_number, cv(), cv()]
          when '#' then 1
          else 0
        end ,
      nb_occupied[iteration_number, any, any] =
        occupied[iteration_number, cv()-1, cv()-1]
        + occupied[iteration_number, cv()-1, cv()]
        + occupied[iteration_number, cv()-1, cv()+1]
        + occupied[iteration_number, cv(), cv()-1]
        + occupied[iteration_number, cv(), cv()+1]
        + occupied[iteration_number, cv()+1, cv()-1]
        + occupied[iteration_number, cv()+1, cv()]
        + occupied[iteration_number, cv()+1, cv()+1]
      ,
      seat[iteration_number+1, any, any] =
        case
          when seat[iteration_number, cv(), cv()] = '.' then
            '.'
          when seat[iteration_number, cv(), cv()] = 'L'
            and nb_occupied[iteration_number, cv(), cv()] = 0 then
            '#'
          when seat[iteration_number, cv(), cv()] = '#'
            and nb_occupied[iteration_number, cv(), cv()] >= 4 then
            'L'
          else seat[iteration_number, cv(), cv()]
        end,
      changes[iteration_number+1, any, any] =
        case when seat[iteration_number, cv(), cv()] != seat[iteration_number+1, cv(), cv()] then 1
        else 0 end,
      sum_changes[iteration_number+1,1,1] = sum(changes)[iteration_number+1, any, any]
    )
  ),
  display as (
    select
      generation,
      row_id,
      sum(changes) changes,
      sum(case seat when '#' then 1 else 0 end) occupied_seats
    from modeled_data
    group by generation, rollup(row_id)
  )
select
  *
from display
where row_id is null
order by generation desc
fetch first row only;

Let’s be honest – 2 minutes are still not great. But it’s bearable. And it’s completely done in SQL.

For part 2, however, I decided to use PL/SQL. Algorithms like these is where SQL hits its limit – it’s just not the right tool to solve it, given that a solution in a procedural language is not that difficult.

First, I re-coded part 1, which was straight forward. To hold the data I used a varray(100) of varrays(100) of integer, basically a 2-dimensional array.

The main logic is in the functions iterate_cell_part1 and especially is_occupied:

  function is_occupied( y integer, x integer, i_array in t_rows ) return integer as
  begin
    if y <= 0 or y > i_array.count or x <= 0 or x > i_array(y).count then
      return 0;
    end if;
    return case i_array(y)(x) when '#' then 1 else 0 end;
  end;
  function iterate_cell_part1( y integer, x integer, i_array in t_rows ) return char as
    l_neighbours integer;
  begin
    if i_array(y)(x) = '.' then
      return '.';
    end if;
    l_neighbours :=
        is_occupied(y-1, x-1, i_array)
      + is_occupied(y-1, x  , i_array)
      + is_occupied(y-1, x+1, i_array)
      + is_occupied(y  , x-1, i_array)
      + is_occupied(y  , x+1, i_array)
      + is_occupied(y+1, x-1, i_array)
      + is_occupied(y+1, x  , i_array)
      + is_occupied(y+1, x+1, i_array);
    if i_array(y)(x) = 'L' and l_neighbours = 0 then
      return '#';
    elsif i_array(y)(x) = '#' and l_neighbours >= 4 then
      return 'L';
    else
      return i_array(y)(x);
    end if;
  end;

To solve part 2, all I had to do was to change is_occupied to sees_occupied – the rest is pretty much the same:

  function sees_occupied( y integer, x integer, ystep integer, xstep integer, i_array in t_rows ) 
  return integer as
    l_nextY integer := y + ystep;
    l_nextX integer := x + xstep;
  begin
    if l_nextY <= 0 or l_nextY > i_array.count or l_nextX <= 0 or l_nextX > i_array(y).count then
      return 0;
    end if;
    return case i_array(l_nextY)(l_nextX)
      when '.' then sees_occupied(l_nextY, l_nextX, ystep, xstep, i_array)
      when '#' then 1
      else 0 end;
  end;

The nice thing: I got my result in 7 seconds for part 1 and 13 seconds for part 2.

You can see the whole solution my github-repository.


0 Comments

Leave a Reply

Avatar placeholder

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.