All talk but no code...

makes lulalala a bluff boy

Gem development inside Rails app

Often during app development, it's a good idea to extract some functionality into a gem. The simple way to do this is to create a new git repository, run bundle gem foobar, publish the gem, install it in the Rails app, then use it and test some more.

How about updates? We have to change the gem while guessing how it will be used inside Rails, then release a version, install it in the app, and finally do some testing. This is a lot of friction, and it is especially bad if the gem is closely coupled with the app or gets updated a lot.

How about this?

  1. Create an empty gem (e.g. bundle gem foobar) without releasing it.
  2. Push it to GitHub.
  3. Add the gem to your Rails app as a submodule: git submodule add <repository-url> vendor/foobar
  4. Reference the empty gem in your app's Gemfile: gem 'foobar', path: './vendor/foobar'
  5. ...
  6. Profit!

Now develop the gem entirely inside the app's submodule. It's probably possible to autoload it (though I don't recommend it). This setup also works in production.

Rails' many default_url_options

I have read about so many different ways to set default_url_options, but at least in Rails 5.1.4 only some of them work. Often one works in the console but not in controllers, or the other way round:

# in development.rb
config.action_controller.default_url_options({:protocol => 'https'})
config.action_controller.default_url_options(:protocol => 'https')
# Does not work

# in development.rb, outside config block
Rails.application.routes.default_url_options[:protocol] = 'https'
# Does not work, but works under console

# in routes.rb
Rails.application.routes.draw do
  default_url_options protocol: :https
end
# Does not work, but works under console

# in ApplicationController
def default_url_options(options={})
  { secure: true }
end
# Does not work

# in ApplicationController
def default_url_options
  { protocol: :https }
end
# Works in browser, but does not work under console

# in development.rb
config.action_controller.default_url_options = {:protocol => 'https'}
# Works in browser, but does not work under console

This means we probably want to set the options in two different places for it to always work. I think this has caused many Stack Overflow questions, and it deserves a Rails repo issue.

Pundit and controller based authorization

If we have an Order class, the pundit gem will figure out that it should use OrderPolicy for authorization. But what if we have multiple domains using the same item?

For example, on an online auction site there will be a Seller::OrdersController and a Buyer::OrdersController, managing the two sides of the same order. The idea is that a buyer should only be allowed to update via Buyer::OrdersController, not Seller::OrdersController, and vice versa for a seller. Obviously we want two sets of policies.

However, Pundit's namespaced policies require us to call authorize [:seller, item] in order to use Seller::ItemPolicy. This can become repetitive.

For me, there is less room for error if we have a 1:1 relationship between controller and policy, so that a controller can be assumed to always use the same policy. So I patched the authorize call so that the policy can be set at the controller level.


Put the following under ApplicationController as private methods.

  class_attribute :pundit_policy_class

  def self.set_pundit_policy_class(klass)
    self.pundit_policy_class = klass
  end

  def authorize(record, query: nil, policy_class: nil)
    query ||= params[:action].to_s + "?"

    @_pundit_policy_authorized = true

    # Policies are built from (user, record); pundit_user defaults to current_user.
    if policy_class
      policy = policy_class.new(pundit_user, record)
    elsif self.pundit_policy_class
      policy = self.pundit_policy_class.new(pundit_user, record)
    else
      policy = policy(record)
    end

    unless policy.public_send(query)
      raise NotAuthorizedError, query: query, record: record, policy: policy
    end

    record
  end

In your controller, you can then do this to set policy:

class Seller::OrdersController < ApplicationController
  set_pundit_policy_class Seller::OrderPolicy
end

This means all actions under this controller will use Seller::OrderPolicy by default when you call authorize.

If we want to override this, we can also pass in policy_class in authorize:

   authorize @order, policy_class: FooPolicy

Note that I made some changes to authorize's method signature: query is a keyword argument now. It also returns the record object (as planned for its 1.2 release).

AdequateErrors - Overcoming limitation of Rails model errors API

Over the years I encountered many issues related to the ActiveModel::Errors API. After looking at the Rails source, I realized the original design was the root cause: errors was originally just a hash of arrays of strings, which worked for simple requirements but not for more complex ones.

In April I started collecting use cases and studying the Rails source. Last month I finally got my hands dirty implementing a solution: a gem that applies object-oriented design principles to make each error an object, and provides a new set of APIs to access these objects. I call it AdequateErrors.

AdequateErrors can be accessed by calling model.errors.adequate. It co-exists with the existing Rails API, so nothing will break. But what issues does it solve? Let me list them one by one:

Query on errors using where

Imagine we need to find the :empty error on any attribute:

model.errors.details.each { |attribute, errors|
  errors.find { |h|
    h[:error] == :empty
  }
}

AdequateErrors provides a where method. Now we can stop using loops, and write complex queries:

model.errors.adequate.where(type: :empty)
model.errors.adequate.where(attribute: :title, type: :empty)
model.errors.adequate.where(attribute: :title, type: :empty, count: 3)

This returns an array of Error objects. Simple.
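To make the idea concrete, here is a minimal plain-Ruby sketch of how a where-style query over error objects could work. It is not the gem's actual implementation: ErrorSketch, ErrorListSketch and SAMPLE_ERRORS are hypothetical names invented for this illustration, and I assume extra conditions such as count are matched against each error's options.

```ruby
# Hypothetical stand-in for an error object; not the gem's actual class.
ErrorSketch = Struct.new(:attribute, :type, :options) do
  # True when every condition equals the matching field or option value.
  def matches?(conditions)
    conditions.all? do |key, value|
      case key
      when :attribute then attribute == value
      when :type      then type == value
      else                 options[key] == value
      end
    end
  end
end

# Hypothetical collection exposing a where-style query.
class ErrorListSketch
  def initialize(errors)
    @errors = errors
  end

  # Returns an array of the errors that satisfy all conditions.
  def where(conditions)
    @errors.select { |error| error.matches?(conditions) }
  end
end

SAMPLE_ERRORS = ErrorListSketch.new([
  ErrorSketch.new(:title, :empty, {}),
  ErrorSketch.new(:title, :too_short, { count: 3 }),
  ErrorSketch.new(:body, :empty, {})
])

SAMPLE_ERRORS.where(type: :empty).map(&:attribute)           # [:title, :body]
SAMPLE_ERRORS.where(attribute: :title, count: 3).map(&:type) # [:too_short]
```

Because each error is a single object, a query is just a filter over one list; there is no second structure to keep in sync.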

Access both the message and details of one particular error

If one attribute has two foo_error errors and one bar_error, e.g.:

# model.errors.details
{:name=>[{error: :foo_error, count: 1}, {error: :bar_error}, {error: :foo_error, count: 3}]}

How do you access the message of the third error? With the current implementation, we have to resort to array indexes:

model.errors.messages[:name][2]

Or we can call generate_message to recreate a message from the details, but that's also tedious.

With AdequateErrors, we won't have this problem. Each error is represented as an object, and message and details are its attributes, so accessing them is straightforward:

e = model.errors.adequate.where(attribute: :title, type: :foo_error).first
e.message # full message
e.options # similar to details; meta information such as `:count` is stored here

Lazily evaluating message for internationalization

@morgoth mentioned this issue: when you add an error, it is translated right away.

# actual:
I18n.with_locale(:pl) { user.errors.full_messages } # => always outputs EN errors

# expecting:
I18n.with_locale(:pl) { user.errors.full_messages } # => outputs PL errors
I18n.with_locale(:pt) { user.errors.full_messages } # => outputs PT errors

Taking this into consideration, AdequateErrors lazily evaluates messages only when message is called.
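A rough plain-Ruby sketch of what lazy evaluation means here, without Rails or I18n. Everything in it is an assumption made for illustration: TRANSLATIONS holds placeholder strings rather than real locale data, LocaleSketch is a tiny imitation of I18n.with_locale, and LazyErrorSketch is not the gem's class.

```ruby
# Placeholder strings standing in for real locale files.
TRANSLATIONS = {
  en: { blank: "can't be blank" },
  pl: { blank: "[pl] can't be blank" }
}.freeze

# Tiny imitation of I18n.with_locale, only for this sketch.
module LocaleSketch
  @current = :en

  class << self
    attr_reader :current

    # Switches the locale for the duration of the block, then restores it.
    def with_locale(locale)
      previous = @current
      @current = locale
      yield
    ensure
      @current = previous
    end
  end
end

# Stores only the attribute and error type; translation happens in #message.
class LazyErrorSketch
  def initialize(attribute, type)
    @attribute = attribute
    @type = type
  end

  def message
    "#{@attribute} #{TRANSLATIONS.fetch(LocaleSketch.current).fetch(@type)}"
  end
end

LAZY_ERROR = LazyErrorSketch.new("Name", :blank)

LocaleSketch.with_locale(:pl) { LAZY_ERROR.message } # "Name [pl] can't be blank"
LAZY_ERROR.message                                   # "Name can't be blank"
```

The error only remembers what went wrong, not how to say it, so the same object can be rendered in whichever locale is active at the time message is called.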

Error message attribute prefix

Not all error messages start with the attribute name, but Rails forces you to do this. People have developed hacks to bypass this. Others simply assigned errors to :base instead of the actual attribute. This is ugly.

Here is AdequateErrors' solution. It has its own namespace in the locale file, and instead of the global default format "%{attribute} %{message}", the prefix is moved into each individual entry:




      invalid: "%{attribute} is invalid"

      inclusion: "%{attribute} is not included in the list"

      exclusion: "%{attribute} is reserved"

All built-in error types have been converted to this format. If you want a prefix-less error, simply write its entry in the locale file without the %{attribute}.
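As a small illustration of per-entry prefixes, the sketch below substitutes the attribute name only where an entry asks for it. MESSAGE_TEMPLATES, render_error_message and the terms_not_accepted entry are hypothetical; the real gem goes through I18n rather than a hash and gsub.

```ruby
# Hypothetical templates mirroring the locale entries; not the gem's lookup code.
MESSAGE_TEMPLATES = {
  invalid: "%{attribute} is invalid",
  terms_not_accepted: "You must accept the terms" # prefix-less entry
}.freeze

# Fills in the attribute name wherever the template contains %{attribute}.
def render_error_message(type, attribute_name)
  MESSAGE_TEMPLATES.fetch(type).gsub("%{attribute}", attribute_name)
end

render_error_message(:invalid, "Email")            # "Email is invalid"
render_error_message(:terms_not_accepted, "Terms") # "You must accept the terms"
```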

Just less error-prone code

I remember when I first learned about object-oriented design principles at uni, there was this example of a payroll system. In the system, one array stores account names and another array stores account numbers. Whenever we need to delete an account, we need to manipulate both arrays. Furthermore, if we need to add a new attribute, we need to add a third array. It is very clear that turning accounts into objects makes the system simpler and less error-prone.

This is what the current Rails errors implementation looks like:

def copy!(other) # :nodoc:
  @messages = other.messages.dup
  @details  = other.details.dup
end

def clear
  messages.clear
  details.clear
end

def delete(key)
  attribute = key.to_sym
  details.delete(attribute)
  messages.delete(attribute)
end

This is similar to the case I mentioned above, and it can really benefit from an object-oriented approach.

If you are a long-time Rails developer, chances are you have met similar issues before, so please try this gem. If you have other use cases that you wish to improve, I would like to know and see whether they can be added to the gem. Happy hacking!

Magic That Decorator/Presenter Gems Do to Make You Type Less

Some of you Rails developers have probably used a 'decorator' or 'presenter' library. These libraries aim to bridge the Rails model and view layers. If I were to define it, a presenter lets developers group the helper methods related to a model under a namespace for that model, instead of the current global space.

But would you believe it? There are actually a dozen or more decorator/presenter gems out there. Why do we keep reinventing the wheel? The first reason is that there is real demand, because keeping a large number of helper methods under the same namespace is just unrealistic. The second reason is that these gem owners have different views on a philosophical question: should the interface be implicit (things are done for the user under the hood) or explicit (the user has to type more)?

As a fun exercise I will compare 6 of these gems and explain how the general concept works. I have not used some of the gems and have only read their readme and some of the code, so if there are mistakes please let me know.

*Disclaimer: I am the owner of the LulalalaPresenter gem. If you have never used a presenter/decorator before, read this post to know why it is useful.

The following table represents the spectrum of these gems. Towards the left end are gems that favor implicitness; towards the right end, gems are more explicit in nature.

Draper Oprah Display-case Lulalala
link link link link link link
Decorate (Quack like a model) Y optional Y Y
Decorates Association Y optional optional
Directly call helper method within decorator Y optional
Globally accessible view context Y Y Y
Automatically decorates Y
Presenter/Model mapping 1:1 N:1 N:1 N:1 1:1 1:1
Lines of code 281 718 130 375 110 38
Lines of test 473 3037 354 1556 294 n/a

(I found many more gems after writing this. A complete list can be found here. If you want to add your gem to it, let me know~)

"I hate magic" camp

The conservative rubyists prefer to avoid magic. They don't like to override too many things, and they write POROs instead of meta-programming.

The simplest example can be seen in Ryan Bates' RailsCast: "Presenters from Scratch". In the video he explains step by step how to make a simple presenter.

The only meta-programming used is how it infers the presenter class from the model. All other interfaces are plain object passing and method calls. Only 38 lines are required to achieve this.
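The inference step can be sketched in plain Ruby as below. This is my own approximation rather than the code from the RailsCast: Article, ArticlePresenter and this present helper are hypothetical, and I use Object.const_get where a Rails app would more likely use constantize.

```ruby
# Hypothetical model and presenter used only for this sketch.
class Article
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

class ArticlePresenter
  def initialize(model, view_context)
    @model = model
    @view_context = view_context
  end

  def heading
    @model.title.upcase
  end
end

# The one bit of meta-programming: derive "ArticlePresenter" from the model's class name.
def present(model, view_context = nil)
  presenter_class = Object.const_get("#{model.class.name}Presenter")
  presenter_class.new(model, view_context)
end

present(Article.new("hello")).heading # "HELLO"
```

Everything else is ordinary object construction, which is why the explicit approach stays so small.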

Since by default it does not act like a model, Ryan calls it a presenter instead of a decorator. You can still delegate calls to the model if you wish, but the readme indicates that the presenter object should not be mixed up with the model object.

I'll talk about LulalalaPresenter later after ActiveDecorator, because it is its fork.


According to Design Patterns in Ruby, the decorator pattern is a wrapper that "supports the same core interface, but adds its own twist on that interface." In this case we add view related functionality around the model. A decorator can act as if it is the model, which means less view changes are required.

ActiveDecorator, oprah and display-case all position themselves as decorators. Draper on the other hand gives the user freedom to choose if the wrapper should become a decorator or not.

One association further

Associations are part of the ActiveModel interface too, and some gems can decorate associations for you to save some keystrokes (example). This is no small task, as there are multiple ways to trigger associations. It is therefore more likely to break across Rails versions.

Directly call helper methods within decorator

Normally, for a model decorator to call helper methods it needs to go through view_context (often aliased as h), e.g. h.url_for(). Both ActiveDecorator and Draper offer a way to save keystrokes so you can call url_for directly. This is done with a simple trick: if the model does not support a method, we retry it on the view context (example).

The implications are: 1. this is a wee bit slower because method_missing is used. 2. if the same method name exists on both the model and view_context, the model's method takes precedence. This is usually not a problem.
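Here is a minimal sketch of the fallback trick using a wrapper object. It is not how either gem is actually implemented; FakeViewContext, FallbackDecoratorSketch and Post are hypothetical stand-ins so the example runs without Rails.

```ruby
# Hypothetical stand-in for Rails' view context.
class FakeViewContext
  def link_to(text, url)
    %(<a href="#{url}">#{text}</a>)
  end
end

class FallbackDecoratorSketch
  def initialize(model, view_context)
    @model = model
    @view_context = view_context
  end

  # Try the model first, then the view context, so the model takes precedence.
  def method_missing(name, *args, &block)
    if @model.respond_to?(name)
      @model.public_send(name, *args, &block)
    elsif @view_context.respond_to?(name)
      @view_context.public_send(name, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @model.respond_to?(name) || @view_context.respond_to?(name) || super
  end

  # Both title (model) and link_to (view context) are reached via method_missing.
  def title_link
    link_to(title, "/posts/1")
  end
end

Post = Struct.new(:title)
decorator = FallbackDecoratorSketch.new(Post.new("Hi"), FakeViewContext.new)
decorator.title_link # <a href="/posts/1">Hi</a>
```

The ordering of the two respond_to? checks is what gives the model's method precedence when both sides define the same name.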

Globally accessible view_context

Draper, ActiveDecorator and LulalalaPresenter all keep the view_context in a globally accessible place (example). This is done for two reasons:

  1. To give an OO-esque feel to the decoration method:

    Draper decorates by calling model.decorate
    LulalalaPresenter presents by calling model.presenter
    This design saves you from passing view_context; otherwise you would need to call model.decorate(view_context) all the time.

  2. To allow automatic decoration (see below).
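One way such a globally accessible view context could be kept is in thread-local storage, as in the sketch below. This is an assumption for illustration, not the mechanism any of these gems necessarily uses; ViewContextStore, FakeView, Product and ProductPresenter are hypothetical names.

```ruby
# Hypothetical thread-local holder; a real app would set it once per request.
module ViewContextStore
  def self.current
    Thread.current[:view_context_sketch]
  end

  def self.current=(view_context)
    Thread.current[:view_context_sketch] = view_context
  end
end

# Hypothetical stand-in for the Rails view context.
class FakeView
  def number_to_currency(amount)
    format("$%.2f", amount)
  end
end

class ProductPresenter
  def initialize(model)
    @model = model
  end

  # The presenter fetches the view context itself instead of receiving it.
  def h
    ViewContextStore.current
  end

  def price_tag
    h.number_to_currency(@model.price)
  end
end

Product = Struct.new(:price) do
  def presenter
    ProductPresenter.new(self)
  end
end

ViewContextStore.current = FakeView.new
Product.new(5).presenter.price_tag # "$5.00"
```

With the context stored out of band, model.presenter needs no arguments, which is the typing saved at the cost of some hidden global state.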

The Holy Grail of Implicitness

We have reached the far end of implicitness. To automatically decorate things, ActiveDecorator hooks into the render call, and decorates instance variables when applicable (example).

The gem goes the extra mile for you, so you don't have to do anything besides writing the decorator class. In a way, this feels like the Rails philosophy, where we just write the controller/model/view, and things hook up perfectly without you knowing what's happening under the hood.

My Own Presenter and Conclusion

As you can see from the table, most are decorators. They behave like models, delegating calls to the object inside. However, if we treat the presenter as a separate object, we can remove a lot of the delegation complexity.

I was looking for a presenter which does not involve decorating ActiveModel, but I couldn't find one. Ryan's solution was the closest I could find. So I thought I could make my own.

I personally hate typing parentheses, because my left little finger aches when holding the shift key. Instead of present(model).foo, I prefer model.presenter.foo. If we do this, how can the presenter get hold of the view_context object? We could pass it in every time, but that's definitely too much to type. In the end I discovered ActiveDecorator's globally accessible view_context, so I used it to make my own. Hopefully this does not offend anyone; I feel guilty, but I want to please my little finger.

Do we need the presenter/decorator pattern? Some may argue it is an offense to the MVC architecture, and some may think it introduces extra complexity. Again, we will probably never reach a consensus on this, but I use it because it makes coding Rails more pleasant. So if you have a large number of helper methods in your codebase, I recommend that you just pick a gem and try it out :)

I wrote an address tokenizer using machine learning

A few years ago, I was assigned the task of extracting city/suburb names from our crawler results. I wrote a parser using a bunch of if/else statements and regular expressions. It mostly worked, except in some extreme cases. To parse those extreme cases, I added more if statements and more obscure regular expressions. In the end I felt the code was very unreadable.

But was I an incompetent programmer? A few months ago I read a blog post about using machine learning to do address parsing, and I realized that my old approach of creating rules is not how our brains work. A lot of cases really require us to think in terms of probability ("if there are more than three characters followed by this, it is probably a street"). This is fuzzy logic, but my if/else statements and regular expressions are discrete logic operating on a boolean level.

So as a pet project, I decided to implement an address parser in Ruby. The Python community already has Parserator, so why not in Rubyland? I am from Taiwan, so I also wanted to try applying it to addresses here.

I used the Conditional Random Fields model, though reading the Wikipedia article fried my brain.

I don't understand any of it. However, I still held out hope that I could just copy & paste something and it would work out eventually. Even though we don't know how to make a Lego block, we can still build things with it without all the background knowledge, right?

The first step is to gather training data. My friend said that such data is confidential and can cost money, so I looked elsewhere. Eventually I found out that people add address entries on a site called OpenStreetMap. Regional data can be downloaded from a site called Gisgraphy. The file is in .pbf, which stands for Protocolbuffer Binary Format, so I used the pbf_parser gem to access the data inside. Not all of the data are addresses; some are bus routes and some are geometry data. I wrote a parser to extract the addresses into a SQL database. There were around 15000 records.

Though in OSM people enter addresses in separate fields such as city and suburb, in reality which field represents what is not strictly followed. This is especially true in Eastern countries, where there are a few distinct levels which do not have an English counterpart. People also put the full address in the street field and the like. So I had to write scripts to boldly move the data around the columns and add new columns to match Taiwanese address rules. I feel I touched more than 2/3 of the addresses. I call this part cleaning.

Once cleaning is done, all we have to do is feed the data in to train the model. Sylvester Keil wrote two RubyGems for CRF training, one of which is called wapiti. It is a wrapper around a C library of the same name. He was very kind and helped me when I wanted to know how to use the gem.

Eventually I was able to feed my data into wapiti and create a model file. Because some East Asian languages do not separate phrases with space characters, I have to chop the address into individual characters and then feed them in. On the other end, when the model produces its result, I have to combine neighbouring characters with the same label back into a phrase.
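The chopping and re-combining steps can be sketched in plain Ruby as follows. The labelling itself (the CRF model) is left out; the labels array below is hand-written sample output, and to_characters and merge_labels are hypothetical helper names, not part of wapiti or my gem.

```ruby
# Split an address into single characters, the unit fed to the model.
def to_characters(address)
  address.chars
end

# Merge neighbouring characters that share a label back into [phrase, label] pairs.
def merge_labels(characters, labels)
  characters.zip(labels)
            .chunk_while { |(_, a), (_, b)| a == b }
            .map { |group| [group.map(&:first).join, group.first.last] }
end

chars  = to_characters("台北市中正區")
labels = %w[city city city district district district] # hand-written sample labels

merge_labels(chars, labels)
# => [["台北市", "city"], ["中正區", "district"]]
```

chunk_while keeps extending a group while consecutive labels are equal, so a label change is exactly where one phrase ends and the next begins.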

The result was much better than I expected: it can parse common addresses just fine, all without me writing any rules at all. I created a website for people to try it out, so I can also gather some new data.

People do inform me of extreme cases where the tokenization fails. As this is my first time writing something using machine learning, the feeling is quite different, something like this:

if result.wrong?
  say "Not me! It's its fault!
       The machine is too stupid to learn~~"
  guilt = 0 # do not feel guilty at all~
else
  say "Hehe"
  feels "complimented"
  happiness += 100
end

I published a gem and provided a model file. The gem is intended for East Asian addresses (Chinese, Japanese and Korean), so if you are in one of these regions, please try creating your own model. Once you plug it in, it should just work. When I have time, I plan to put my training data online for others to correct.

Always use respond_to

We have a controller action which has an error handling state.

  if error
    flash[:error] = render_to_string(partial: "foo_error").html_safe
    render :error # renders error.js.erb
  end

However, even though the js is rendered, the response content-type is still "text/html". (The request Accept header is set to */*;q=0.5, text/javascript, application/javascript, application/ecmascript, application/x-ecmascript.) This was very puzzling to us, because it had always worked before.

Later we found out that the root cause was our call to render_to_string. If we remove that call, Rails is again able to guess the content-type as text/javascript. Something in the Rails internals must set the type when render_to_string is called, even though its result is not directly used for rendering the response.

Using respond_to would also solve this issue, forcing Rails to return text/javascript as the content type.

  if error
    flash[:error] = render_to_string(partial: "foo_error").html_safe
    respond_to do |format|
      format.js { render :error }
    end
  end

I guess we should always specify respond_to, so the content-type can be deterministic.

Thoughts on RubyKaigi 2015

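I attended RubyKaigi 2015 at the end of last year. Recordings of all the talks are now available online.

Matz opened the conference with the slogan "Ruby 3x3": the hope that Ruby 3 will be three times as fast as Ruby 2. In response, many talks revolved around this theme. Sometimes the same technology came up in different talks or at different layers of abstraction, which made it stick in my mind.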


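The second talk on the first day was IBM's Experiments in sharing Java VM technology with CRuby. The topic was OMR, a technology IBM developed that will be open-sourced soon. The project aims to open-source the modules of J9, IBM's own JVM engine, that are especially common across language implementations, in the hope that other languages will adopt them in their own implementations. The benefit is that languages can share technology instead of each reinventing the wheel, and when these modules improve, every language gets the improvement for free. On the third day IBM went further and explained how OMR's GC works, along with other ways Ruby might be optimized. Clearly IBM is pushing this hard.

For some speculation about OMR, there is an interesting Chinese-language article worth reading.

Meanwhile on the JRuby side, Oracle explained how Truffle lets metaprogramming run fast too.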


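The last talk on the last day echoed the earlier ones: the keynote Ruby 2020 by Evan Phoenix, the developer of Rubinius. Small-time developers like me used to think that if Ruby is slow, rewriting core functionality in C is the best way to speed it up. But that leaves the JIT techniques of recent years with nothing to work on, because C code is a black box whose internal logic cannot be inspected, and those techniques depend heavily on knowing what the program will do next in order to optimize it. Rewriting Ruby's core library in Ruby would be one solution, but that is too huge an undertaking to be realistic.

So Evan proposed processing the core library written in C with LLVM, so that Ruby and C end up at the same level, which gives techniques such as JIT room to work. This of course needs more research, but the speaker himself thinks it is feasible, which got an outsider like me very excited. The talk was explained so simply that even I, who know nothing about VMs, could follow what was going on. I highly recommend watching it.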

High Performance Template Engine: Guide to optimize your Ruby code

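Two developers from the same company each built their own Haml engine to make Haml faster. They covered how template engines work, where Haml's bottlenecks are, and how to overcome them. Suitable for people already comfortable with Ruby; interesting and easy to follow.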

Charming Robots

Using Ruby to fly a drone with a dance pad~ If I had to pick the most fun demo, this was it. Please watch the video.

Plugin-based software design with Ruby and RubyGems

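When writing open-source software we often want what we build to be easily extensible, the way Firefox can install plugins. This talk covered how Treasure Data approached a plugin architecture. The result is that each plugin is itself a gem, with dependency management and so on. I hope to find time to understand it properly someday.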

Turbo Rails with Rust

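An introduction to the Rust language and how Ruby can call into its code. It successfully got me hooked, and I'd like to find time to learn it. What stuck with me most was his point that static dispatch matters because it is the prerequisite for other optimizations.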





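Food and Parties

The organizer, Akira Matsuda, mentioned that the conference was held in Tsukiji precisely because the Tsukiji market was due to close in 2016, so he hoped everyone would take the chance to try all kinds of sushi and seafood (this is also why the conference used sushi as its visual motif). Naturally I kept throwing down ¥2000 and ¥3000 at a time. Delicious!

There was a party every day. Besides the organizers' own, companies held theirs too, reportedly starting from the Tuesday before the conference. I went to the Heroku Pre-Party on Thursday, and the first person I got talking to was Paolo Perrotta, the author of Metaprogramming Ruby, who was very friendly.

One day there was karaoke in Ginza, and since the place was upscale it cost ¥5000 (faints). The ones I felt sorry for were some of the foreigners: everyone was singing anime songs and there was no notion of taking turns, so they hardly got to pick or sing anything. Karaoke like this really should limit each person to one song at a time~ Still, watching JRuby's Charles Nutter sing the Hikaru no Go theme song while reading romaji off his phone screen was priceless~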

Ruby Kaigi 2016

The next RubyKaigi is confirmed for Kyoto, 9/8 to 9/10. It's marked on my calendar and I'm looking forward to it~