Find All Cities That Have No Photos

Last week I noticed quite good example to illustrate the problem of finding records with empty association, i.e. when particular has_many association returns zero records.

In this post I’d love to go a little bit further into proposed solution and figure out why it actually works.

Following original question, let’s say we’ve two classes:

city.rb
1
2
3
class City < ActiveRecord::Base
  has_many :photos
end
photo.rb
1
2
3
class Photo < ActiveRecord::Base
  belongs_to :city
end

And while solution is easy:

1
City.includes(:photos).where(photos: { city_id: nil })

It might not feel quite obvious, for example, some peoples critiques:

Isn’t it looking for photos that don’t have a city_id? That’s not the same as cities for which there is no photo with that particular city’s id as the foreign key

In order to figure that out let’s refer to the documentation:

http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html

According to which, conditions or orders that references the tables other than the main one, cause Active Record to fall back to LEFT OUTER JOIN based strategy.

For example:

1
Post.includes([:author, :comments]).where(['comments.approved = ?', true])

This will result in a single SQL query with joins along the lines of:

1
LEFT OUTER JOIN comments ON comments.post_id = posts.id and LEFT OUTER JOIN authors ON authors.id = posts.author_id

Or in our case with cities and photos:

1
LEFT OUTER JOIN photos ON photos.city_id = city.id WHERE photos.city_id IS NULL

The key thing here is the LEFT OUTER JOIN which, after the “normal” join, adds remaining records from cities which are not in the result set and placing NULLs for all the fields of photos.

Let’s see by example:

Cities

id name
1 London
2 Paris
3 Berlin

Photos

city_id file_name
1 bridge.jpg
2 eiffel.jpg

1st Step: Inner Join

id name city_id file_name
1 London 1 bridge.jpg
2 Paris 2 eiffel.jpg

2nd Step: Left Outer Join

id name city_id file_name
1 London 1 bridge.jpg
2 Paris 2 eiffel.jpg
3 Berlin null null

Resulting Table: WHERE clause

id name city_id file_name
3 Berlin null null

And that’s exactly what we’re looking for :)

Conclusion

Good article explaining outer joins: http://www.postgresqlforbeginners.com/2010/11/sql-outer-joins.html

Comments