Regexp#match?
Ruby 2.4 adds a new #match? method for regular expressions which, in the benchmark below, is roughly three times faster than the other Regexp matching methods available in Ruby 2.3:
Regexp#match?: 2630002.5 i/s
Regexp#===: 872217.5 i/s - 3.02x slower
Regexp#=~: 859713.0 i/s - 3.06x slower
Regexp#match: 539361.3 i/s - 4.88x slower
When you call Regexp#===, Regexp#=~, or Regexp#match, Ruby sets the $~ global variable with the resulting MatchData:
/^foo (\w+)$/ =~ 'foo bar' # => 0
$~ # => #<MatchData "foo bar" 1:"bar">
/^foo (\w+)$/.match('foo baz') # => #<MatchData "foo baz" 1:"baz">
$~ # => #<MatchData "foo baz" 1:"baz">
/^foo (\w+)$/ === 'foo qux' # => true
$~ # => #<MatchData "foo qux" 1:"qux">
Regexp#match? returns a boolean and avoids building a MatchData object or updating global state:
/^foo (\w+)$/.match?('foo wow') # => true
$~ # => nil
Because it skips the global variable, Ruby avoids allocating memory for a MatchData object.
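As a rough sketch of where this helps, the predicate form fits any place where only a yes/no answer is needed, such as filtering. The error_lines helper, its default pattern, and the sample log lines are made up for illustration:

```ruby
# Hypothetical helper: keep only the lines that start with the word ERROR.
# match? returns true or false and does not build a MatchData per line.
def error_lines(lines, pattern = /\AERROR\b/)
  lines.select { |line| pattern.match?(line) }
end

log = ['ERROR disk full', 'INFO started', 'ERROR timeout']
errors = error_lines(log)
errors # => ["ERROR disk full", "ERROR timeout"]
```

If you need the captured groups afterwards, #match is still the right tool; #match? only answers whether the pattern matched.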
#sum method for Enumerable
You can now call #sum on any Enumerable object:
[1, 1, 2, 3, 5, 8, 13, 21].sum # => 54
The #sum method takes an optional argument which defaults to 0. This is the starting value of the summation, which is why [].sum is 0.
If you are calling #sum on an array of objects that cannot be added to an Integer, then you need to provide your own initial value:
class ShoppingList
attr_reader :items
def initialize(*items)
@items = items
end
def +(other)
ShoppingList.new(*items, *other.items)
end
end
eggs = ShoppingList.new('eggs') # => #<ShoppingList:0x007f952282e7b8 @items=["eggs"]>
milk = ShoppingList.new('milks') # => #<ShoppingList:0x007f952282ce68 @items=["milks"]>
cheese = ShoppingList.new('cheese') # => #<ShoppingList:0x007f95228271e8 @items=["cheese"]>
eggs + milk + cheese # => #<ShoppingList:0x007f95228261d0 @items=["eggs", "milks", "cheese"]>
[eggs, milk, cheese].sum # => #<TypeError: ShoppingList can't be coerced into Integer>
[eggs, milk, cheese].sum(ShoppingList.new) # => #<ShoppingList:0x007f9522824cb8 @items=["eggs", "milks", "cheese"]>
On the last line an empty shopping list (ShoppingList.new) is supplied as the initial value.
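The same rule applies to built-in types that are not numbers. The values in this short sketch are arbitrary and only illustrate how the initial value and the optional block behave:

```ruby
# Strings: start from an empty string instead of 0.
joined = %w[a b c].sum('')
joined # => "abc"

# Arrays: start from an empty array to concatenate.
flattened = [[1, 2], [3], [4, 5]].sum([])
flattened # => [1, 2, 3, 4, 5]

# A block transforms each element before it is added.
doubled_total = [1, 2, 3].sum { |n| n * 2 }
doubled_total # => 12
```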
In Ruby 2.4 you can test whether directories and files are empty using the File and Dir classes:
Dir.empty?('empty_directory') # => true
Dir.empty?('directory_with_files') # => false
File.empty?('contains_text.txt') # => false
File.empty?('empty.txt') # => true
The File.empty? method is equivalent to File.zero? which is already available in all supported Ruby versions:
File.zero?('contains_text.txt') # => false
File.zero?('empty.txt') # => true
Unfortunately these methods are not available for Pathname yet.
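To try both methods without touching real files, you can work in a temporary directory. This is a minimal sketch that assumes the standard tmpdir library; the notes.txt file name is arbitrary:

```ruby
require 'tmpdir'

results = {}

# Dir.mktmpdir creates a scratch directory and removes it after the block.
Dir.mktmpdir do |dir|
  results[:dir_before] = Dir.empty?(dir)        # nothing created yet

  path = File.join(dir, 'notes.txt')
  File.write(path, '')                          # create a zero-byte file
  results[:dir_after]  = Dir.empty?(dir)        # directory now has an entry
  results[:file_blank] = File.empty?(path)

  File.write(path, 'hello')
  results[:file_text]  = File.empty?(path)
end

results
# => {:dir_before=>true, :dir_after=>false, :file_blank=>true, :file_text=>false}
```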
Regexp match results
In Ruby 2.4 you can call #named_captures on a Regexp match result and get a hash containing your named capture groups and the data they extracted:
pattern = /(?<first_name>John) (?<last_name>\w+)/
pattern.match('John Backus').named_captures # => { "first_name" => "John", "last_name" => "Backus" }
In Ruby 2.4 the #values_at method also accepts capture names, so you can extract just the named captures which you care about:
pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
pattern.match('2016-02-01').values_at(:year, :month) # => ["2016", "02"]
The #values_at method also works for positional capture groups:
pattern = /(\d{4})-(\d{2})-(\d{2})$/
pattern.match('2016-07-18').values_at(1, 3) # => ["2016", "18"]
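Since #named_captures returns a plain hash, it combines well with other hash methods. The parse_date_parts helper below is hypothetical; it assumes you want integers rather than strings and nil when the input does not match:

```ruby
# Hypothetical helper: turn "YYYY-MM-DD" into a hash of integers.
def parse_date_parts(text)
  pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
  match = pattern.match(text)
  return nil unless match

  # named_captures uses string keys; transform_values converts each value.
  match.named_captures.transform_values(&:to_i)
end

parts = parse_date_parts('2016-02-01')
parts # => {"year"=>2016, "month"=>2, "day"=>1}
parse_date_parts('not a date') # => nil
```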
Integer#digits method
If you want to access a digit in a certain position within an integer (from right to left) then you can use Integer#digits:
123.digits # => [3, 2, 1]
123.digits[0] # => 3
# Equivalent behavior in Ruby 2.3:
123.to_s.chars.map(&:to_i).reverse # => [3, 2, 1]
If you want to know positional digit information given a non-decimal base, you can pass in a different radix. For example, to look up positional digit information for a hexadecimal integer you can pass in 16:
0x7b.digits(16) # => [11, 7]
0x7b.digits(16).map { |digit| digit.to_s(16) } # => ["b", "7"]
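The right-to-left ordering is convenient for checksum algorithms that work from the last digit. As an illustration, here is a sketch of a Luhn check; the luhn_valid? name is made up, and the method assumes a non-negative integer because #digits raises Math::DomainError for negative numbers:

```ruby
# Hypothetical helper: Luhn checksum over the digits of a non-negative integer.
def luhn_valid?(number)
  # digits yields the rightmost digit first, so index 0 is the check digit.
  checksum = number.digits.each_with_index.sum do |digit, index|
    next digit if index.even?       # every other digit is left as-is

    doubled = digit * 2             # the rest are doubled...
    doubled > 9 ? doubled - 9 : doubled # ...and reduced to a single digit
  end

  (checksum % 10).zero?
end

luhn_valid?(79927398713) # => true
luhn_valid?(79927398710) # => false
```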
Logger interface
The Logger library in Ruby 2.3 can be a bit cumbersome to set up:
require 'logger'
logger1 = Logger.new(STDOUT)
logger1.level = :info
logger1.progname = 'LOG1'
logger1.debug('This is ignored')
logger1.info('This is logged')
# >> I, [2016-07-17T23:45:30.571508 #19837] INFO -- LOG1: This is logged
Ruby 2.4 moves this configuration to Logger’s constructor:
logger2 = Logger.new(STDOUT, level: :info, progname: 'LOG2')
logger2.debug('This is ignored')
logger2.info('This is logged')
# >> I, [2016-07-17T23:45:30.571556 #19837] INFO -- LOG2: This is logged
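The keyword arguments work with any IO-like target, which makes a logger easy to inspect in tests. This sketch assumes a StringIO as the log device; the DEMO program name and the messages are arbitrary:

```ruby
require 'logger'
require 'stringio'

# Log into an in-memory buffer instead of STDOUT.
output = StringIO.new
logger = Logger.new(output, level: :warn, progname: 'DEMO')

logger.info('This is ignored') # below the :warn threshold
logger.warn('This is logged')

log_text = output.string
log_text.include?('DEMO: This is logged') # => true
log_text.include?('This is ignored')      # => false
```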
Parsing command line flags with OptionParser often involves a lot of boilerplate in order to compile the options down into a hash:
require 'optparse'
require 'optparse/date'
require 'optparse/uri'
config = {}
cli =
OptionParser.new do |options|
options.define('--from=DATE', Date) do |from|
config[:from] = from
end
options.define('--url=ENDPOINT', URI) do |url|
config[:url] = url
end
options.define('--names=LIST', Array) do |names|
config[:names] = names
end
end
Now you can provide a hash via the :into keyword argument when parsing arguments:
require 'optparse'
require 'optparse/date'
require 'optparse/uri'
cli =
OptionParser.new do |options|
options.define '--from=DATE', Date
options.define '--url=ENDPOINT', URI
options.define '--names=LIST', Array
end
config = {}
args = %w[
--from 2016-02-03
--url https://blog.blockscore.com/
--names John,Daniel,Delmer
]
cli.parse(args, into: config)
config.keys # => [:from, :url, :names]
config[:from] # => #<Date: 2016-02-03 ((2457422j,0s,0n),+0s,2299161j)>
config[:url] # => #<URI::HTTPS https://blog.blockscore.com/>
config[:names] # => ["John", "Daniel", "Delmer"]
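Because :into writes into whatever hash you hand it, one possible pattern is to fill that hash with defaults first and let the parsed flags override them. The --port and --verbose flags and the input.txt argument below are invented for this sketch:

```ruby
require 'optparse'

parser =
  OptionParser.new do |options|
    options.define '--port=NUMBER', Integer
    options.define '--verbose'
  end

# Defaults go in first; parsed options overwrite matching keys.
settings = { port: 8080 }

# parse returns the arguments that were not consumed as options.
remaining = parser.parse(%w[--port 3000 --verbose input.txt], into: settings)

settings  # => {:port=>3000, :verbose=>true}
remaining # => ["input.txt"]
```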
Array#min and Array#max
In Ruby 2.4 the Array class defines its own #min and #max instance methods instead of relying on Enumerable. In the benchmark below this makes Array#min about 1.6 times faster than Enumerable#min:
Array#min: 35.1 i/s
Enumerable#min: 21.8 i/s - 1.61x slower
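You can confirm where a method is defined by asking for its owner. This is only an introspection sketch; Hash is included as an example of a class that, as far as this check shows, still relies on Enumerable for #min:

```ruby
# Array now has its own implementations...
array_min_owner = Array.instance_method(:min).owner
array_max_owner = Array.instance_method(:max).owner

# ...while Hash still inherits #min from Enumerable.
hash_min_owner = Hash.instance_method(:min).owner

array_min_owner # => Array
array_max_owner # => Array
hash_min_owner  # => Enumerable
```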
Until Ruby 2.4 you had to manage many numeric types:
# Find classes which subclass the base "Numeric" class:
numerics = ObjectSpace.each_object(Module).select { |mod| mod < Numeric }
# In Ruby 2.3:
numerics # => [Complex, Rational, Bignum, Float, Fixnum, Integer, BigDecimal]
# In Ruby 2.4:
numerics # => [Complex, Rational, Float, Integer, BigDecimal]
Now Fixnum and Bignum are implementation details that Ruby manages for you. This should help avoid subtle bugs like this:
def categorize_number(num)
case num
when Fixnum then 'fixed number!'
when Float then 'floating point!'
end
end
# In Ruby 2.3:
categorize_number(2) # => "fixed number!"
categorize_number(2.0) # => "floating point!"
categorize_number(2 ** 500) # => nil
# In Ruby 2.4:
categorize_number(2) # => "fixed number!"
categorize_number(2.0) # => "floating point!"
categorize_number(2 ** 500) # => "fixed number!"
If you have Bignum or Fixnum hardcoded in your source code, it keeps working in Ruby 2.4, although referencing either constant emits a deprecation warning. Both constants now point to Integer:
Fixnum # => Integer
Bignum # => Integer
Integer # => Integer
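For new code it is simpler to match on Integer directly, which behaves the same for small and large values. The sketch below reworks the earlier categorize_number example; the returned labels are arbitrary:

```ruby
# Matching on Integer covers both small and very large whole numbers.
def categorize_number(num)
  case num
  when Integer then 'integer!'
  when Float   then 'floating point!'
  else 'something else!'
  end
end

categorize_number(2)        # => "integer!"
categorize_number(2**500)   # => "integer!"
categorize_number(2.0)      # => "floating point!"
categorize_number('two')    # => "something else!"
```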
#round, #ceil, #floor, and #truncate now accept a precision argument:
4.55.ceil(1) # => 4.6
4.55.floor(1) # => 4.5
4.55.truncate(1) # => 4.5
4.55.round(1) # => 4.6
These methods all work the same on Integer as well:
4.ceil(1) # => 4.0
4.floor(1) # => 4.0
4.truncate(1) # => 4.0
4.round(1) # => 4.0
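The precision can also be negative, in which case the methods work on the digits to the left of the decimal point. This behavior is not covered by the examples above, so treat this as a supplementary sketch with arbitrary values:

```ruby
# Negative precision rounds to tens, hundreds, and so on.
rounded   = 1234.round(-2)
ceiled    = 1234.ceil(-2)
floored   = 1234.floor(-2)
truncated = -1234.truncate(-2)  # truncate moves toward zero
neg_floor = -1234.floor(-2)     # floor moves toward negative infinity

rounded   # => 1200
ceiled    # => 1300
floored   # => 1200
truncated # => -1200
neg_floor # => -1300
```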
Consider the following sentence:
My name is JOHN. That is spelled J-Ο-H-N
Calling #downcase on this string in Ruby 2.3 produces this output:
my name is john. that is spelled J-Ο-H-N
This is because “J-Ο-H-N” in the string above is written with non-ASCII Unicode characters, which Ruby 2.3 leaves untouched.
Ruby’s letter casing methods now handle Unicode properly:
sentence = "\uff2a-\u039f-\uff28-\uff2e"
sentence # => "J-Ο-H-N"
sentence.downcase # => "j-ο-h-n"
sentence.downcase.capitalize # => "J-ο-h-n"
sentence.downcase.capitalize.swapcase # => "j-Ο-H-N"
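A few more cases, written with escape sequences so the snippet does not depend on the file's encoding. The words are arbitrary, and the :ascii option shown last is one way to ask for the older ASCII-only behavior:

```ruby
# "\u{C9}" is É, "\u{E9}" is é, "\u{DF}" is ß.
lower = "\u{C9}COLE".downcase
upper = "stra\u{DF}e".upcase
title = "\u{E9}lan".capitalize

# :ascii limits the conversion to A-Z and a-z.
ascii_only = "\u{C9}COLE".downcase(:ascii)

lower      # => "école"
upper      # => "STRASSE"
title      # => "Élan"
ascii_only # => "École"
```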
When creating a string with String.new you can now pass a :capacity option which tells Ruby how much memory it should allocate for your string up front. This can help performance, as Ruby can avoid reallocations as you increase the size of the string in question:
With capacity: 37225.1 i/s
Without capacity: 16031.3 i/s - 2.32x slower
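A minimal usage sketch follows, assuming you roughly know the final size in advance. The 64 KB figure and the row contents are arbitrary; the capacity only sizes the initial buffer, and the string still grows beyond it if needed:

```ruby
# Reserve room up front, then append in a loop.
# Passing '' as the first argument keeps the default source encoding.
buffer = String.new('', capacity: 64 * 1024)

1_000.times { |i| buffer << "row #{i}\n" }

buffer.lines.size            # => 1000
buffer.start_with?("row 0\n") # => true
```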
Ruby 2.3’s Symbol#match returned the match position even though String#match returns MatchData. This inconsistency is fixed in Ruby 2.4:
# Ruby 2.3 behavior:
'foo bar'.match(/^foo (\w+)$/) # => #<MatchData "foo bar" 1:"bar">
:'foo bar'.match(/^foo (\w+)$/) # => 0
# Ruby 2.4 behavior:
'foo bar'.match(/^foo (\w+)$/) # => #<MatchData "foo bar" 1:"bar">
:'foo bar'.match(/^foo (\w+)$/) # => #<MatchData "foo bar" 1:"bar">
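With a MatchData in hand you can read capture groups from a symbol the same way as from a string. The :user_id symbol and the prefix group name are invented for this sketch:

```ruby
# Symbol#match now returns MatchData, so captures are available.
match = :user_id.match(/\A(?<prefix>\w+)_id\z/)
prefix = match[:prefix]

match.class # => MatchData
prefix      # => "user"

# No match still returns nil, as with String#match.
:username.match(/\A(?<prefix>\w+)_id\z/) # => nil
```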
You can now assign multiple variables within a conditional:
branch1 =
if (foo, bar = %w[foo bar])
'truthy'
else
'falsey'
end
branch2 =
if (foo, bar = nil)
'truthy'
else
'falsey'
end
branch1 # => "truthy"
branch2 # => "falsey"
You probably shouldn’t do that though.
If an exception is raised inside a thread, Ruby by default silently swallows the error:
puts 'Starting some parallel work'
thread =
Thread.new do
sleep 1
fail 'something very bad happened!'
end
sleep 2
puts 'Done!'
$ ruby parallel-work.rb
Starting some parallel work
Done!
If you want to fail the entire process when an exception happens within a thread then you can use Thread.abort_on_exception = true. Adding this to the parallel-work.rb script above would change the output to:
$ ruby parallel-work.rb
Starting some parallel work
parallel-work.rb:9:in 'block in <main>': something very bad happened! (RuntimeError)
In Ruby 2.4 you now have a middle ground between errors being silently ignored and aborting your entire program. Instead of abort_on_exception you can set Thread.report_on_exception = true:
$ ruby parallel-work.rb
Starting some parallel work
#<Thread:0x007ffa628a62b8@parallel-work.rb:6 run> terminated with exception:
parallel-work.rb:9:in 'block in <main>': something very bad happened! (RuntimeError)
Done!
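Reporting does not replace handling. If the main program needs to react to the failure, it can still join the thread, which re-raises the exception in the caller. A minimal sketch, reusing the error message from the script above:

```ruby
# Print a report to stderr when a thread dies from an exception.
Thread.report_on_exception = true

worker =
  Thread.new do
    raise 'something very bad happened!'
  end

# join re-raises the thread's exception in the calling thread.
error_message =
  begin
    worker.join
    nil
  rescue RuntimeError => error
    error.message
  end

error_message # => "something very bad happened!"
```

The report still appears on stderr, while the rescue gives the main thread a chance to clean up or retry.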