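This chapter actually falls outside the scope of "An Http Request Through Rails" and is included purely for study. Because Active Record is not a single end-to-end flow, the chapter can only analyze several of its aspects through a series of examples.
In addition, a good deal of Active Model is mixed in. Since Active Model is rarely used on its own, this text treats the two together without distinguishing them.
Also, because the ORM objects in Active Record are `ActiveRecord::Base` objects, this text simply calls them Active Record objects.
Let's start with the most basic piece of code. The database is SQLite3, and the Active Record object used here is `User`, which has no special attributes: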
User.all.to_a
First, `all` is one of the methods that `ActiveRecord::Querying` delegates. This module delegates most of the ORM query methods, and the target is always `scoped`:
delegate :find, :first, :first!, :last, :last!, :all, :exists?, :any?, :many?, :to => :scoped
delegate :first_or_create, :first_or_create!, :first_or_initialize, :to => :scoped
delegate :destroy, :destroy_all, :delete, :delete_all, :update, :update_all, :to => :scoped
delegate :find_each, :find_in_batches, :to => :scoped
delegate :select, :group, :order, :except, :reorder, :limit, :offset, :joins,
:where, :preload, :eager_load, :includes, :from, :lock, :readonly,
:having, :create_with, :uniq, :to => :scoped
delegate :count, :average, :minimum, :maximum, :sum, :calculate, :pluck, :to => :scoped
Besides these, the module provides two methods that query directly with SQL: `find_by_sql` and `count_by_sql`.
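The delegation pattern used by `ActiveRecord::Querying` can be sketched in plain Ruby. This is only an illustration: `TinyRelation` and `TinyModel` are hypothetical names, and the standard library's `Forwardable` stands in for Active Support's `delegate`.

```ruby
require "forwardable"

# Hypothetical stand-in for ActiveRecord::Relation: wraps an array of rows.
class TinyRelation
  def initialize(rows)
    @rows = rows
  end

  def all
    @rows.dup
  end

  def first
    @rows.first
  end

  def count
    @rows.size
  end
end

# Hypothetical model class: class-level query methods are forwarded to `scoped`.
class TinyModel
  ROWS = [{ id: 1, name: "alice" }, { id: 2, name: "bob" }].freeze

  class << self
    extend Forwardable
    # Each listed method becomes a class method that calls scoped.<method>.
    def_delegators :scoped, :all, :first, :count

    def scoped
      TinyRelation.new(ROWS)
    end
  end
end
```

Calling `TinyModel.count` therefore builds a fresh relation through `scoped` and asks it for the count, which mirrors how `User.all` ends up on a `Relation`.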
Next, `scoped` runs. It is a class method in the `ActiveRecord::Scoping::Named` module:
# Returns an anonymous \scope.
#
# posts = Post.scoped
# posts.size # Fires "select count(*) from posts" and returns the count
# posts.each {|p| puts p.name } # Fires "select * from posts" and loads post objects
#
# fruits = Fruit.scoped
# fruits = fruits.where(:color => 'red') if options[:red_only]
# fruits = fruits.limit(10) if limited?
#
# Anonymous \scopes tend to be useful when procedurally generating complex
# queries, where passing intermediate values (\scopes) around as first-class
# objects is convenient.
#
# You can define a \scope that applies to all finders using
# ActiveRecord::Base.default_scope.
def scoped(options = nil)
if options
scoped.apply_finder_options(options)
else
if current_scope
current_scope.clone
else
scope = relation
scope.default_scoped = true
scope
end
end
end
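`scoped` accepts a hash of finder options to refine the query, but we pass none here. If a scope is currently in effect, `scoped` returns a clone of it. No scope has been set in our case, so it calls `relation` to obtain an `ActiveRecord::Relation` object, sets `default_scoped` to true on it, and returns it.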
`relation` is defined in `ActiveRecord::Base`, which is well known as the core class of Active Record. Its implementation is:
def relation
relation = Relation.new(self, arel_table)
if finder_needs_type_condition?
relation.where(type_condition).create_with(inheritance_column.to_sym => sti_name)
else
relation
end
end
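It first creates an `ActiveRecord::Relation` object, then adds a where clause when STI support requires one. `Relation` holds all the query conditions accumulated so far and is the core of lazy querying. Its constructor is: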
ASSOCIATION_METHODS = [:includes, :eager_load, :preload]
MULTI_VALUE_METHODS = [:select, :group, :order, :joins, :where, :having, :bind]
SINGLE_VALUE_METHODS = [:limit, :offset, :lock, :readonly, :from, :reordering, :reverse_order, :uniq]
def initialize(klass, table)
@klass, @table = klass, table
@implicit_readonly = nil
@loaded = false
@default_scoped = false
SINGLE_VALUE_METHODS.each {|v| instance_variable_set(:"@#{v}_value", nil)}
(ASSOCIATION_METHODS + MULTI_VALUE_METHODS).each {|v| instance_variable_set(:"@#{v}_values", [])}
@extensions = []
@create_with_value = {}
end
As you can see, an instance variable is initialized for every possible query condition.
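To make the idea of a lazy query concrete, here is a minimal sketch over an in-memory array. `LazyRelation` is a hypothetical class, not the real `ActiveRecord::Relation`: it only shows conditions being stored in instance variables and evaluated once, when `to_a` is first called.

```ruby
# Hypothetical lazy relation over an in-memory array.
class LazyRelation
  attr_reader :load_count

  def initialize(source)
    @source = source
    @where_values = []  # multi-value condition, starts empty
    @limit_value = nil  # single-value condition, starts unset
    @loaded = false
    @load_count = 0
  end

  # Each query method returns a modified copy and loads nothing.
  def where(&predicate)
    derive { @where_values += [predicate] }
  end

  def limit(n)
    derive { @limit_value = n }
  end

  # The stored conditions are only applied here, and only once.
  def to_a
    return @records if @loaded
    @load_count += 1
    rows = @source.select { |row| @where_values.all? { |p| p.call(row) } }
    rows = rows.first(@limit_value) if @limit_value
    @loaded = true
    @records = rows
  end

  private

  def derive(&change)
    copy = clone
    copy.instance_exec do
      @loaded = false
      @records = nil
      @load_count = 0
    end
    copy.instance_exec(&change)
    copy
  end
end
```

For example, `LazyRelation.new([1, 2, 3, 4]).where { |n| n.even? }.limit(1)` does no filtering until `to_a` is called on the result.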
Instantiating the `Relation` object calls `arel_table`, which is implemented in `ActiveRecord::Base`:
def arel_table
@arel_table ||= Arel::Table.new(table_name, arel_engine)
end
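This first determines the database table name for the current class through `table_name`, defined in the `ActiveRecord::ModelSchema` module in `activerecord-3.2.13/lib/active_record/model_schema.rb`. The module deals with schema-related concerns such as tables, columns, and sequences. `table_name` is implemented as: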
def table_name
reset_table_name unless defined?(@table_name)
@table_name
end
If `@table_name` has not been set yet, `reset_table_name` is called first to compute a table name:
# Computes the table name, (re)sets it internally, and returns it.
def reset_table_name
if abstract_class?
self.table_name = if superclass == Base || superclass.abstract_class?
nil
else
superclass.table_name
end
elsif superclass.abstract_class?
self.table_name = superclass.table_name || compute_table_name
else
self.table_name = compute_table_name
end
end
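As you can see, an abstract class takes its superclass's table name (or `nil` if the superclass is `Base` or is itself abstract), and a class whose superclass is abstract uses the superclass's table name if one is set. In every other case `compute_table_name` is called to compute one: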
# Computes and returns a table name according to default conventions.
def compute_table_name
base = base_class
if self == base
# Nested classes are prefixed with singular parent table name.
if parent < ActiveRecord::Base && !parent.abstract_class?
contained = parent.table_name
contained = contained.singularize if parent.pluralize_table_names
contained += '_'
end
"#{full_table_name_prefix}#{contained}#{undecorated_table_name(name)}#{table_name_suffix}"
else
# STI subclasses always use their superclass' table.
base.table_name
end
end
It first has to find the class used to compute the table name, which is what `base_class` does:
# Returns the base AR subclass that this class descends from. If A
# extends AR::Base, A.base_class will return A. If B descends from A
# through some arbitrarily deep hierarchy, B.base_class will return A.
#
# If B < A and C < B and if A is an abstract_class then both B.base_class
# and C.base_class would return B as the answer since A is an abstract_class.
def base_class
class_of_active_record_descendant(self)
end
# Returns the class descending directly from ActiveRecord::Base or an
# abstract class, if any, in the inheritance hierarchy.
def class_of_active_record_descendant(klass)
if klass == Base || klass.superclass == Base || klass.superclass.abstract_class?
klass
elsif klass.superclass.nil?
raise ActiveRecordError, "#{name} doesn't belong in a hierarchy descending from ActiveRecord"
else
class_of_active_record_descendant(klass.superclass)
end
end
The rules are just as the comment on `base_class` describes, so they are not repeated here.
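The rule in that comment can be tried out with a small stand-alone hierarchy. This is a sketch under assumptions: `TinyBase` and the model classes below are hypothetical, and the walk up the hierarchy is written as a loop rather than the recursion used in Rails.

```ruby
# Hypothetical root class playing the role of ActiveRecord::Base.
class TinyBase
  class << self
    attr_accessor :abstract_class

    def abstract_class?
      abstract_class == true
    end

    # Walk up until the class sits directly below TinyBase or below an abstract class.
    def base_class
      klass = self
      until klass == TinyBase || klass.superclass == TinyBase || klass.superclass.abstract_class?
        klass = klass.superclass
      end
      klass
    end
  end
end

class Person < TinyBase; end
class Employee < Person; end

class Shared < TinyBase
  self.abstract_class = true
end

class Invoice < Shared; end
class Refund < Invoice; end
```

With these classes, `Employee.base_class` is `Person`, while `Invoice.base_class` and `Refund.base_class` are both `Invoice`, because `Shared` is abstract.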
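Next, note the `parent` method in `compute_table_name`. It comes from the core extension in `activesupport-3.2.13/lib/active_support/core_ext/module/introspection.rb`: when the class is nested inside another class or module, it returns that enclosing module or class. The code is quite simple, so please read it yourself.
If the class's `parent` is itself a non-abstract `ActiveRecord::Base` subclass, the parent's table name is taken, singularized when the parent has `pluralize_table_names` enabled, and prepended to the table name as a prefix followed by an underscore.
`full_table_name_prefix` looks through all `parents` for the first one that responds to `table_name_prefix` and uses its value; if none does, it uses the current class's own `table_name_prefix`: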
def full_table_name_prefix
(parents.detect{ |p| p.respond_to?(:table_name_prefix) } || self).table_name_prefix
end
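`parents` is the array version of `parent`: it returns all enclosing modules or classes as an array, ending with `Object`.
The core method for computing the table name is `undecorated_table_name`: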
# Guesses the table name, but does not decorate it with prefix and suffix information.
def undecorated_table_name(class_name = base_class.name)
table_name = class_name.to_s.demodulize.underscore
table_name = table_name.pluralize if pluralize_table_names
table_name
end
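The three steps of `undecorated_table_name` (demodulize, underscore, pluralize) can be approximated without Active Support. The helpers below are deliberately naive stand-ins written for this illustration; the real inflector handles far more cases, and `naive_pluralize` in particular is an assumption that only appends an "s".

```ruby
# Naive helpers: only good enough for simple English class names.
def demodulize(name)
  name.split("::").last
end

def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

def naive_pluralize(word)
  word.end_with?("s") ? word : "#{word}s"
end

# Mirrors the shape of undecorated_table_name: strip modules, snake_case, pluralize.
def guess_table_name(class_name, pluralize: true)
  table = underscore(demodulize(class_name))
  pluralize ? naive_pluralize(table) : table
end
```

So a class named `Admin::LineItem` would map to `line_items`, and `User` to `users`.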
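`undecorated_table_name` is simple enough to need no detailed explanation. Note also that an STI subclass always uses its base class's table name. That is all for table names. Next comes `arel_engine`, the second argument needed to initialize `Arel::Table`: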
def arel_engine
@arel_engine ||= begin
if self == ActiveRecord::Base
ActiveRecord::Base
else
connection_handler.retrieve_connection_pool(self) ? self : superclass.arel_engine
end
end
end
This is the first mention of a connection, so it is worth covering how Active Record initializes the database. The code is in `ActiveRecord::Railtie`:
# This sets the database configuration from Configuration#database_configuration
# and then establishes the connection.
initializer "active_record.initialize_database" do |app|
ActiveSupport.on_load(:active_record) do
db_connection_type = "DATABASE_URL"
unless ENV['DATABASE_URL']
db_connection_type = "database.yml"
self.configurations = app.config.database_configuration
end
Rails.logger.info "Connecting to database specified by #{db_connection_type}"
establish_connection
end
end
The key call in this code is `establish_connection`, which sets up the database-related parts:
def self.establish_connection(spec = ENV["DATABASE_URL"])
resolver = ConnectionSpecification::Resolver.new spec, configurations
spec = resolver.spec
unless respond_to?(spec.adapter_method)
raise AdapterNotFound, "database configuration specifies nonexistent #{spec.config[:adapter]} adapter"
end
remove_connection
connection_handler.establish_connection name, spec
end
`ConnectionSpecification::Resolver` is defined in `activerecord-3.2.13/lib/active_record/connection_adapters/abstract/connection_specification.rb`. Its job is to create the required `ConnectionSpecification` object, and calling `spec` performs the resolution:
def spec
case config
when nil
raise AdapterNotSpecified unless defined?(Rails.env)
resolve_string_connection Rails.env
when Symbol, String
resolve_string_connection config.to_s
when Hash
resolve_hash_connection config
end
end
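Here `config` is the value handed to the resolver: an environment name, a URL, or a hash. If it is `nil`, `Rails.env` is used. For a `nil`, string, or symbol value, `resolve_string_connection` is then called: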
def resolve_string_connection(spec) # :nodoc:
hash = configurations.fetch(spec) do |k|
connection_url_to_hash(k)
end
raise(AdapterNotSpecified, "#{spec} database is not configured") unless hash
resolve_hash_connection hash
end
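This looks up the database settings from `database.yml` under the given key, normally the Rails environment name. If no entry is found, `spec` may be a URL, so `connection_url_to_hash` is called to parse it: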
def connection_url_to_hash(url) # :nodoc:
config = URI.parse url
adapter = config.scheme
adapter = "postgresql" if adapter == "postgres"
spec = { :adapter => adapter,
:username => config.user,
:password => config.password,
:port => config.port,
:database => config.path.sub(%r{^/},""),
:host => config.host }
spec.reject!{ |_,value| value.blank? }
spec.map { |key,value| spec[key] = URI.unescape(value) if value.is_a?(String) }
if config.query
options = Hash[config.query.split("&").map{ |pair| pair.split("=") }].symbolize_keys
spec.merge!(options)
end
spec
end
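The URL parsing above relies on `blank?` and `URI.unescape`, which are not available in plain modern Ruby. A rough stand-alone equivalent might look like the following. `database_url_to_hash` is a hypothetical name, and `URI.decode_www_form_component` is used as a substitute for `URI.unescape`, so a literal `+` is decoded as a space here.

```ruby
require "uri"

# Hypothetical re-creation of connection_url_to_hash using only the standard library.
def database_url_to_hash(url)
  uri = URI.parse(url)
  adapter = uri.scheme
  adapter = "postgresql" if adapter == "postgres"

  spec = {
    adapter:  adapter,
    username: uri.user,
    password: uri.password,
    port:     uri.port,
    database: uri.path.to_s.sub(%r{\A/}, ""),
    host:     uri.host
  }
  # Drop missing parts (the original uses Active Support's blank?).
  spec.reject! { |_, value| value.nil? || value.to_s.empty? }
  # Undo percent-encoding on string values.
  spec.each do |key, value|
    spec[key] = URI.decode_www_form_component(value) if value.is_a?(String)
  end

  # Query parameters become extra options, e.g. ?pool=5.
  if uri.query
    URI.decode_www_form(uri.query).each { |k, v| spec[k.to_sym] = v }
  end
  spec
end
```

Given `postgres://user:secret@localhost:5432/mydb?pool=5`, the result contains the adapter `postgresql`, the credentials, host, port, database name, and a `:pool` option.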
Then `resolve_hash_connection` is called:
def resolve_hash_connection(spec) # :nodoc:
spec = spec.symbolize_keys
raise(AdapterNotSpecified, "database configuration does not specify adapter") unless spec.key?(:adapter)
begin
require "active_record/connection_adapters/#{spec[:adapter]}_adapter"
rescue LoadError => e
raise LoadError, "Please install the #{spec[:adapter]} adapter: `gem install activerecord-#{spec[:adapter]}-adapter` (#{e.message})", e.backtrace
end
adapter_method = "#{spec[:adapter]}_connection"
ConnectionSpecification.new(spec, adapter_method)
end
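This loads the adapter class that matches the configured adapter and then creates the corresponding `ConnectionSpecification` object.
Back in `establish_connection`, it checks that `ActiveRecord::Base` responds to the adapter method `"#{adapter_name}_connection"` and raises an exception otherwise. To avoid a duplicate connection, it then calls `remove_connection`: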
def remove_connection(klass = self)
connection_handler.remove_connection(klass)
end
This simply forwards to `connection_handler`, which is an instance of `ConnectionAdapters::ConnectionHandler`. The handler's method is defined as follows:
# Remove the connection for this class. This will close the active
# connection and the defined connection (if they exist). The result
# can be used as an argument for establish_connection, for easily
# re-establishing the connection.
def remove_connection(klass)
pool = @class_to_pool.delete(klass.name)
return nil unless pool
@connection_pools.delete pool.spec
pool.automatic_reconnect = false
pool.disconnect!
pool.spec.config
end
Since no connection has been made yet, this does nothing here. Finally, `connection_handler.establish_connection` is called to set up the connection:
def establish_connection(name, spec)
@connection_pools[spec] ||= ConnectionAdapters::ConnectionPool.new(spec)
@class_to_pool[name] = @connection_pools[spec]
end
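The code shows that `connection_handler`'s `@connection_pools` is a Hash whose keys are `ConnectionSpecification` objects and whose values are `ConnectionAdapters::ConnectionPool` objects, while `@class_to_pool` is a Hash from class name to `ConnectionAdapters::ConnectionPool`. The purpose of the `ConnectionHandler` class is to maintain these two important instance variables.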
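The bookkeeping done with these two hashes can be sketched as follows. `TinyHandler` is a hypothetical miniature: pools are plain objects, specs are whatever object the caller passes, and `pool_for` is an assumed name for a lookup that falls back to the superclass.

```ruby
# Hypothetical miniature of the handler's bookkeeping.
class TinyHandler
  def initialize
    @connection_pools = {}  # spec => pool
    @class_to_pool = {}     # class name => pool
  end

  def establish_connection(name, spec)
    # One pool per spec, shared by every class registered with that spec.
    @connection_pools[spec] ||= Object.new  # stand-in for ConnectionPool.new(spec)
    @class_to_pool[name] = @connection_pools[spec]
  end

  # Look up by class name, walking up the hierarchy when nothing is registered.
  def pool_for(klass)
    pool = @class_to_pool[klass.name]
    return pool if pool
    return nil if klass.superclass.nil?
    pool_for(klass.superclass)
  end
end

class Root; end
class Child < Root; end
class Other; end
```

Registering only `Root` is enough for `Child` to find a pool, and two classes registered with the same spec share one pool object.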
This method does not actually open a connection to the database, but everything is now prepared. The real connection is made the first time `ActiveRecord::Base.connection` is called:
# Returns the connection currently associated with the class. This can
# also be used to "borrow" the connection to do database work unrelated
# to any of the specific Active Records.
def connection
retrieve_connection
end
def retrieve_connection
connection_handler.retrieve_connection(self)
end
`connection_handler.retrieve_connection` is implemented as:
# Locate the connection of the nearest super class. This can be an
# active or defined connection: if it is the latter, it will be
# opened and set as the active connection for the class it was defined
# for (not necessarily the current class).
def retrieve_connection(klass)
pool = retrieve_connection_pool(klass)
(pool && pool.connection) or raise ConnectionNotEstablished
end
def retrieve_connection_pool(klass)
pool = @class_to_pool[klass.name]
return pool if pool
return nil if ActiveRecord::Base == klass
retrieve_connection_pool klass.superclass
end
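`retrieve_connection_pool` is the inverse of the earlier setup: it fetches the matching `ConnectionAdapters::ConnectionPool` object, falling back to the superclass when none is registered for the class (which conveniently also supports STI). `connection` is then called on the pool: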
# Retrieve the connection associated with the current thread, or call
# #checkout to obtain one if necessary.
#
# #connection can be called any number of times; the connection is
# held in a hash keyed by the thread id.
def connection
synchronize do
@reserved_connections[current_connection_id] ||= checkout
end
end
`@reserved_connections` is maintained by `ConnectionPool`. The current connection ID is determined like this:
def current_connection_id
ActiveRecord::Base.connection_id ||= Thread.current.object_id
end
So `@reserved_connections` tracks the connections of different threads within one `ConnectionPool`, and different threads never share a connection at the same time.
The method that returns or creates a connection is `checkout`:
def checkout
synchronize do
waited_time = 0
loop do
conn = @connections.find { |c| c.lease }
unless conn
if @connections.size < @size
conn = checkout_new_connection
conn.lease
end
end
if conn
checkout_and_verify conn
return conn
end
if waited_time >= @timeout
raise ConnectionTimeoutError, "could not obtain a database connection#{" within #{@timeout} seconds" if @timeout} (waited #{waited_time} seconds). The max pool size is currently #{@size}; consider increasing it."
end
# Sometimes our wait can end because a connection is available,
# but another thread can snatch it up first. If timeout hasn't
# passed but no connection is avail, looks like that happened --
# loop and wait again, for the time remaining on our timeout.
before_wait = Time.now
@queue.wait( [@timeout - waited_time, 0].max )
waited_time += (Time.now - before_wait)
# Will go away in Rails 4, when we don't clean up
# after leaked connections automatically anymore. Right now, clean
# up after we've returned from a 'wait' if it looks like it's
# needed, then loop and try again.
if(active_connections.size >= @connections.size)
clear_stale_cached_connections!
end
end
end
end
The code first looks in `@connections` for a connection whose `lease` returns a truthy value. `lease` is defined in `ActiveRecord::ConnectionAdapters::AbstractAdapter`, the base class of all database adapters:
def lease
synchronize do
unless in_use
@in_use = true
@last_use = Time.now
end
end
end
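`lease` returns a truthy value only when the connection is not already in use. If none is found and the number of connections in `@connections` is still below the limit (5 by default), `checkout_new_connection` creates a new connection: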
def checkout_new_connection
raise ConnectionNotEstablished unless @automatic_reconnect
c = new_connection
c.pool = self
@connections << c
c
end
`@automatic_reconnect` must not be `false`, which means the pool must not have gone through `remove_connection`.
`new_connection` calls into the adapter code:
def new_connection
ActiveRecord::Base.send(spec.adapter_method, spec.config)
end
This initializes the required adapter. Note that every adapter inherits from `ActiveRecord::ConnectionAdapters::AbstractAdapter`. Here is a brief look at the initialization code of `AbstractAdapter`:
def initialize(connection, logger = nil, pool = nil)
super()
@active = nil
@connection = connection
@in_use = false
@instrumenter = ActiveSupport::Notifications.instrumenter
@last_use = false
@logger = logger
@open_transactions = 0
@pool = pool
@query_cache = Hash.new { |h,sql| h[sql] = {} }
@query_cache_enabled = false
@schema_cache = SchemaCache.new self
@visitor = nil
end
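The initialization code is simple. What matters here is the creation of the `SchemaCache` object. This class maintains column and primary key information for tables and is defined in `activerecord-3.2.13/lib/active_record/connection_adapters/schema_cache.rb`: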
def initialize(conn)
@connection = conn
@tables = {}
@columns = Hash.new do |h, table_name|
h[table_name] = conn.columns(table_name, "#{table_name} Columns")
end
@columns_hash = Hash.new do |h, table_name|
h[table_name] = Hash[columns[table_name].map { |col|
[col.name, col]
}]
end
@primary_keys = Hash.new do |h, table_name|
h[table_name] = table_exists?(table_name) ? conn.primary_key(table_name) : nil
end
end
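This sets up the three objects `@columns`, `@columns_hash`, and `@primary_keys`. Each is a Hash with a default block, so the adapter methods are only called the first time a table is looked up.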
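The caching behaviour of a Hash with a default block can be shown in isolation. `TinySchemaCache` is a hypothetical class, and the `adapter` argument is any callable used here in place of `conn.columns`.

```ruby
# Hypothetical cache: the block runs only for keys that are not stored yet.
class TinySchemaCache
  attr_reader :lookups

  def initialize(adapter)
    @lookups = 0
    @columns = Hash.new do |cache, table_name|
      @lookups += 1
      # Storing the value means the block is skipped for this key next time.
      cache[table_name] = adapter.call(table_name)
    end
  end

  def columns(table_name)
    @columns[table_name]
  end
end
```

The first `columns("users")` call reaches the adapter, and later calls for the same table are served from the hash.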
After `new_connection` creates the connection, `lease` is called on it to mark it as in use. `checkout_and_verify` then runs:
def checkout_and_verify(c)
c.run_callbacks :checkout do
c.verify!
end
c
end
This runs the `:checkout` callback around a block that calls `verify!` on the connection. `verify!` checks whether the connection is still valid and reconnects if not:
# Checks whether the connection to the database is still active (i.e. not stale).
# This is done under the hood by calling <tt>active?</tt>. If the connection
# is no longer active, then this method will reconnect to the database.
def verify!(*ignored)
reconnect! unless active?
end
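How `active?` is determined and how reconnection works depend on the adapter implementation, so we go no deeper here.
That completes one checkout. If no idle connection was found and `@connections` is already full, the thread can only wait for a while. It calls `wait` on the condition variable created by `new_cond`, and wakes either when another thread finishes with its connection and signals the condition variable (see the implementation of `checkin`) or when the allotted time runs out. It then tries to clean up connections held by threads that have already finished, to free more connections, and loops through the checkout process again until it finally times out and raises an error.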
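The wait-and-retry loop can be reduced to a small sketch built on the standard library's `MonitorMixin`. `TinyPool` is hypothetical: it leaves out verification and stale-connection cleanup, uses `Object.new` as a connection, and tracks the deadline with a monotonic clock instead of `Time.now`.

```ruby
require "monitor"

# Hypothetical fixed-size pool with a timed wait on checkout.
class TinyPool
  include MonitorMixin

  class TimeoutError < StandardError; end

  def initialize(size:, timeout:)
    super()             # sets up the monitor
    @size = size
    @timeout = timeout
    @available = []     # idle connections
    @created = 0        # connections created so far
    @queue = new_cond   # signalled whenever a connection is checked in
  end

  def checkout
    synchronize do
      deadline = now + @timeout
      loop do
        return @available.pop unless @available.empty?
        if @created < @size
          @created += 1
          return Object.new  # stand-in for a real connection
        end
        remaining = deadline - now
        raise TimeoutError, "could not obtain a connection within #{@timeout} seconds" if remaining <= 0
        # Releases the monitor while waiting; wakes on signal or after `remaining` seconds.
        @queue.wait(remaining)
      end
    end
  end

  def checkin(conn)
    synchronize do
      @available.push(conn)
      @queue.signal
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

The loop matters because a woken thread may find that another thread has already taken the returned connection, in which case it simply waits again for the time that is left.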
That covers database connection initialization and the connection process. Now back to the `relation` method:
def relation #:nodoc:
relation = Relation.new(self, arel_table)
if finder_needs_type_condition?
relation.where(type_condition).create_with(inheritance_column.to_sym => sti_name)
else
relation
end
end
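`finder_needs_type_condition?` checks whether the table has the column that STI requires, `type` by default. If it does and the class does not descend directly from `ActiveRecord::Base`, the class is treated as using STI: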
def finder_needs_type_condition?
# This is like this because benchmarking justifies the strange :false stuff
:true == (@finder_needs_type_condition ||= descends_from_active_record? ? :false : :true)
end
def descends_from_active_record?
if superclass.abstract_class?
superclass.descends_from_active_record?
else
superclass == Base || !columns_hash.include?(inheritance_column)
end
end
# The name of the column containing the object's class when Single Table Inheritance is used
def inheritance_column
if self == Base
'type'
else
(@inheritance_column ||= nil) || superclass.inheritance_column
end
end
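The odd-looking `:true`/`:false` symbols in `finder_needs_type_condition?` exist because `||=` would recompute a memoized `false` on every call. A small sketch of the trick, with a hypothetical `MemoModel` class and a counter added purely to observe how often the check runs:

```ruby
# Hypothetical class showing how symbols let ||= memoize a false result.
class MemoModel
  attr_reader :checks

  def initialize(uses_sti)
    @uses_sti = uses_sti
    @checks = 0
  end

  def needs_type_condition?
    # :false is truthy, so the right-hand side is evaluated only once.
    :true == (@needs_type_condition ||= expensive_check ? :true : :false)
  end

  private

  def expensive_check
    @checks += 1
    @uses_sti
  end
end
```

With a plain boolean, `@needs_type_condition ||= expensive_check` would call `expensive_check` again each time the stored answer was `false`.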
Finally `scoped` sets `scope.default_scoped` to true and returns the `Relation` object, on which `all` is called. `all` is defined in `ActiveRecord::FinderMethods`, in `activerecord-3.2.13/lib/active_record/relation/finder_methods.rb`:
# A convenience wrapper for <tt>find(:all, *args)</tt>. You can pass in all the
# same arguments to this method as you can to <tt>find(:all)</tt>.
def all(*args)
args.any? ? apply_finder_options(args.first).to_a : to_a
end
`apply_finder_options` is defined in `ActiveRecord::SpawnMethods`, a module mainly responsible for merging and copying `Relation` objects. Its source is:
def apply_finder_options(options)
relation = clone
return relation unless options
options.assert_valid_keys(VALID_FIND_OPTIONS)
finders = options.dup
finders.delete_if { |key, value| value.nil? && key != :limit }
([:joins, :select, :group, :order, :having, :limit, :offset, :from, :lock, :readonly] & finders.keys).each do |finder|
relation = relation.send(finder, finders[finder])
end
relation = relation.where(finders[:conditions]) if options.has_key?(:conditions)
relation = relation.includes(finders[:include]) if options.has_key?(:include)
relation = relation.extending(finders[:extend]) if options.has_key?(:extend)
relation
end
This implementation needs little explanation, since it will become clear shortly. Go straight to the implementation of `to_a`:
def to_a
# We monitor here the entire execution rather than individual SELECTs
# because from the point of view of the user fetching the records of a
# relation is a single unit of work. You want to know if this call takes
# too long, not if the individual queries take too long.
#
# It could be the case that none of the queries involved surpass the
# threshold, and at the same time the sum of them all does. The user
# should get a query plan logged in that case.
logging_query_plan do
exec_queries
end
end
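`logging_query_plan` relates to SQL EXPLAIN: when the queries take longer than a threshold, it calls the adapter's `explain` method. We will not study this feature further and move on to `exec_queries`: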
def exec_queries
return @records if loaded?
default_scoped = with_default_scope
if default_scoped.equal?(self)
@records = if @readonly_value.nil? && !@klass.locking_enabled?
eager_loading? ? find_with_associations : @klass.find_by_sql(arel, @bind_values)
else
IdentityMap.without do
eager_loading? ? find_with_associations : @klass.find_by_sql(arel, @bind_values)
end
end
preload = @preload_values
preload += @includes_values unless eager_loading?
preload.each do |associations|
ActiveRecord::Associations::Preloader.new(@records, associations).run
end
# @readonly_value is true only if set explicitly. @implicit_readonly is true if there
# are JOINS and no explicit SELECT.
readonly = @readonly_value.nil? ? @implicit_readonly : @readonly_value
@records.each { |record| record.readonly! } if readonly
else
@records = default_scoped.to_a
end
@loaded = true
@records
end
Look at `with_default_scope`. Its meaning is: if a `default_scope` has been specified, return a relation that merges it in:
def with_default_scope
if default_scoped? && default_scope = klass.send(:build_default_scope)
default_scope = default_scope.merge(self)
default_scope.default_scoped = false
default_scope
else
self
end
end
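Here `default_scoped?` returns true, but the class's `build_default_scope` returns nil because no `default_scope` was ever specified, so `with_default_scope` returns `self` (because this method is important, it is analyzed again later in this text). Continuing with `exec_queries`, `default_scoped` is therefore identical to `self`, and execution reaches the check `@readonly_value.nil? && !@klass.locking_enabled?` (the check is probably there to work around bugs in `IdentityMap`; see the comments in `activerecord-3.2.13/lib/active_record/identity_map.rb`. We will not study `IdentityMap` in depth; because it could cause many bugs, it was removed in Rails 4). Since `@readonly_value` is not set, the first part is true, and since the table has no special `lock_version` column, `locking_enabled?` returns false. So `eager_loading?` is evaluated next: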
def eager_loading?
@should_eager_load ||=
@eager_load_values.any? ||
@includes_values.any? && (joined_includes_values.any? || references_eager_loaded_tables?)
end
None of the variables it refers to were set for this query, so `eager_loading?` returns false, and `@klass.find_by_sql(arel, @bind_values)` is executed directly.
First comes `arel`, a method of `ActiveRecord::QueryMethods` defined in `ruby-1.9.3-p429/gems/activerecord-3.2.13/lib/active_record/relation/query_methods.rb`:
def arel
@arel ||= with_default_scope.build_arel
end
`with_default_scope` was explained above and gives the same result here. The part to look at is `build_arel`:
def build_arel
arel = table.from table
build_joins(arel, @joins_values) unless @joins_values.empty?
collapse_wheres(arel, (@where_values - ['']).uniq)
arel.having(*@having_values.uniq.reject{|h| h.blank?}) unless @having_values.empty?
arel.take(connection.sanitize_limit(@limit_value)) if @limit_value
arel.skip(@offset_value.to_i) if @offset_value
arel.group(*@group_values.uniq.reject{|g| g.blank?}) unless @group_values.empty?
order = @order_values
order = reverse_sql_order(order) if @reverse_order_value
arel.order(*order.uniq.reject{|o| o.blank?}) unless order.empty?
build_select(arel, @select_values.uniq)
arel.distinct(@uniq_value)
arel.from(@from_value) if @from_value
arel.lock(@lock_value) if @lock_value
arel
end
In fact most of this code has no effect on our query. The one step that matters is `build_select`, which simply selects every column of the table:
def build_select(arel, selects)
unless selects.empty?
@implicit_readonly = false
arel.project(*selects)
else
arel.project(@klass.arel_table[Arel.star])
end
end
Most of this code is easy to follow, being simple calls into the Arel library, so it is not analyzed line by line.
Next, `find_by_sql` runs:
def find_by_sql(sql, binds = [])
logging_query_plan do
connection.select_all(sanitize_sql(sql), "#{name} Load", binds).collect! { |record| instantiate(record) }
end
end
Start with `sanitize_sql`. It is defined in the `ActiveRecord::Sanitization` module, in `activerecord-3.2.13/lib/active_record/sanitization.rb`, where `sanitize_sql` is an alias of `sanitize_sql_for_conditions`. So we look at `sanitize_sql_for_conditions`:
# Accepts an array, hash, or string of SQL conditions and sanitizes
# them into a valid SQL fragment for a WHERE clause.
# ["name='%s' and group_id='%s'", "foo'bar", 4] returns "name='foo''bar' and group_id='4'"
# { :name => "foo'bar", :group_id => 4 } returns "name='foo''bar' and group_id='4'"
# "name='foo''bar' and group_id='4'" returns "name='foo''bar' and group_id='4'"
def sanitize_sql_for_conditions(condition, table_name = self.table_name)
return nil if condition.blank?
case condition
when Array; sanitize_sql_array(condition)
when Hash; sanitize_sql_hash_for_conditions(condition, table_name)
else condition
end
end
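`ActiveRecord::Sanitization` handles SQL that needs preprocessing, in which case the argument is an Array or a Hash. Here we pass an `Arel::SelectManager` object, so it is returned unchanged.
Next is `connection.select_all`, which has two layers. The outer layer is implemented by `ActiveRecord::ConnectionAdapters::QueryCache`, defined in `activerecord-3.2.13/lib/active_record/connection_adapters/abstract/query_cache.rb`. It generates the SQL and caches the result of executing it. The inner layer is implemented by `ActiveRecord::ConnectionAdapters::DatabaseStatements`, defined in `activerecord-3.2.13/lib/active_record/connection_adapters/abstract/database_statements.rb`. We look at the `QueryCache` implementation first: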
def select_all(arel, name = nil, binds = [])
if @query_cache_enabled && !locked?(arel)
sql = to_sql(arel, binds)
cache_sql(sql, binds) { super(sql, name, binds) }
else
super
end
end
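If the query cache is enabled and the query is not a locking one (a locking query is excluded because running the same SQL with a lock may give a different result than running it without), the SQL string is obtained first, then executed and its result cached. First, the implementation of `to_sql`: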
# Converts an arel AST to SQL
def to_sql(arel, binds = [])
if arel.respond_to?(:ast)
visitor.accept(arel.ast) do
quote(*binds.shift.reverse)
end
else
arel
end
end
Calling `accept` on `visitor` with the AST obtained earlier produces the final SQL string. `cache_sql` then runs:
def cache_sql(sql, binds)
result =
if @query_cache[sql].key?(binds)
ActiveSupport::Notifications.instrument("sql.active_record",
:sql => sql, :binds => binds, :name => "CACHE", :connection_id => object_id)
@query_cache[sql][binds]
else
@query_cache[sql][binds] = yield
end
result.collect { |row| row.dup }
end
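The code shows that every SQL result is cached in `@query_cache`. On a cache hit the result is returned directly; otherwise the block runs and calls the method of the same name in the underlying `DatabaseStatements` layer: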
# Returns an array of record hashes with the column names as keys and
# column values as values.
def select_all(arel, name = nil, binds = [])
select(to_sql(arel, binds), name, binds)
end
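The two-level structure of `@query_cache` (SQL string first, bind values second) and the defensive `dup` of each row can be sketched like this. `TinyQueryCache` is hypothetical, and the block passed to `cache_sql` returns a canned row instead of querying a database.

```ruby
# Hypothetical query cache keyed by SQL string, then by bind values.
class TinyQueryCache
  attr_reader :executions

  def initialize
    @query_cache = Hash.new { |h, sql| h[sql] = {} }
    @executions = 0
  end

  def cache_sql(sql, binds)
    result =
      if @query_cache[sql].key?(binds)
        @query_cache[sql][binds]
      else
        @query_cache[sql][binds] = yield
      end
    # Hand out copies so callers cannot modify the cached rows.
    result.map { |row| row.dup }
  end

  def select_all(sql, binds = [])
    cache_sql(sql, binds) do
      @executions += 1
      [{ "id" => 1, "name" => "alice" }]  # stand-in for a real database call
    end
  end
end
```

Running the same statement twice executes the block once, and changing a returned row leaves the cached copy untouched.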
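In this `select_all`, `to_sql` receives SQL that has already been generated, so it is not converted again, and `select` executes the statement. `select` is defined in the SQLite adapter class: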
def select(sql, name = nil, binds = [])
exec_query(sql, name, binds).to_a
end
def exec_query(sql, name = nil, binds = [])
log(sql, name, binds) do
# Don't cache statements without bind values
if binds.empty?
stmt = @connection.prepare(sql)
cols = stmt.columns
records = stmt.to_a
stmt.close
stmt = records
else
cache = @statements[sql] ||= {
:stmt => @connection.prepare(sql)
}
stmt = cache[:stmt]
cols = cache[:cols] ||= stmt.columns
stmt.reset!
stmt.bind_params binds.map { |col, val|
type_cast(val, col)
}
end
ActiveRecord::Result.new(cols, stmt.to_a)
end
end
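This code consists entirely of calls into the SQLite library, so we do not study it. What matters is how the fetched data is turned into Active Record objects. First an `ActiveRecord::Result` object is created: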
def initialize(columns, rows)
@columns = columns
@rows = rows
@hash_rows = nil
end
Once the `ActiveRecord::Result` object is returned, control goes back to `select`, which calls `to_a` on it. `ActiveRecord::Result` includes the `Enumerable` module, so `to_a` calls `each` to produce the result:
def each
hash_rows.each { |row| yield row }
end
The core method is `hash_rows`, which turns the columns and the result set into hashes:
def hash_rows
@hash_rows ||=
begin
# We freeze the strings to prevent them getting duped when
# used as keys in ActiveRecord::Model's @attributes hash
columns = @columns.map { |c| c.dup.freeze }
@rows.map { |row|
Hash[columns.zip(row)]
}
end
end
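How `Enumerable` and `hash_rows` cooperate can be tried with a stand-alone class. `TinyResult` is a hypothetical copy of the idea that needs nothing beyond core Ruby.

```ruby
# Hypothetical result set: column names plus raw rows, exposed as hashes.
class TinyResult
  include Enumerable

  def initialize(columns, rows)
    @columns = columns
    @rows = rows
    @hash_rows = nil
  end

  # Enumerable builds to_a, map, select and so on from this one method.
  def each(&block)
    hash_rows.each(&block)
  end

  private

  def hash_rows
    @hash_rows ||= begin
      columns = @columns.map { |c| c.dup.freeze }
      # Pair every column name with the value at the same position in the row.
      @rows.map { |row| Hash[columns.zip(row)] }
    end
  end
end
```

Because only `each` is defined, `to_a` comes from `Enumerable` and yields one hash per database row.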
Finally, turning each hash into an Active Record object is done by the `instantiate` method that `find_by_sql` calls:
# Finder methods must instantiate through this method to work with the
# single-table inheritance model that makes it possible to create
# objects of different types from the same table.
def instantiate(record)
sti_class = find_sti_class(record[inheritance_column])
record_id = sti_class.primary_key && record[sti_class.primary_key]
if ActiveRecord::IdentityMap.enabled? && record_id
instance = use_identity_map(sti_class, record_id, record)
else
instance = sti_class.allocate.init_with('attributes' => record)
end
instance
end
It first finds the class to instantiate by passing the record's value in the `inheritance_column` (usually `type`) to `find_sti_class`, defined in the `ActiveRecord::Inheritance` module:
def find_sti_class(type_name)
if type_name.blank? || !columns_hash.include?(inheritance_column)
self
else
begin
if store_full_sti_class
ActiveSupport::Dependencies.constantize(type_name)
else
compute_type(type_name)
end
rescue NameError
raise SubclassNotFound,
"The single-table inheritance mechanism failed to locate the subclass: '#{type_name}'. " +
"This error is raised because the column '#{inheritance_column}' is reserved for storing the class in case of inheritance. " +
"Please rename this column if you didn't intend it to be used for storing the inheritance class " +
"or overwrite #{name}.inheritance_column to use another column for that information."
end
end
end
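If there is no `inheritance_column` or its value is blank, the class to instantiate is the class itself; otherwise it is the class named by that column's value. Then, if `ActiveRecord::IdentityMap` is enabled and the record contains a primary key, `IdentityMap` is searched, and if an entry is found it is taken and reinitialized. If nothing is found, or `ActiveRecord::IdentityMap` is not enabled, an instance is allocated first and then initialized with `init_with`: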
# Initialize an empty model object from +coder+. +coder+ must contain
# the attributes necessary for initializing an empty model object. For
# example:
#
# class Post < ActiveRecord::Base
# end
#
# post = Post.allocate
# post.init_with('attributes' => { 'title' => 'hello world' })
# post.title # => 'hello world'
def init_with(coder)
@attributes = self.class.initialize_attributes(coder['attributes'])
@relation = nil
@attributes_cache, @previously_changed, @changed_attributes = {}, {}, {}
@association_cache = {}
@aggregation_cache = {}
@readonly = @destroyed = @marked_for_destruction = false
@new_record = false
run_callbacks :find
run_callbacks :initialize
self
end
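The combination of `find_sti_class`, `allocate`, and `init_with` can be sketched without Active Record. `StiRecord` and its subclasses are hypothetical, the type column is assumed to be `"type"`, and class lookup uses `Object.const_get` in place of Active Support's constant resolution.

```ruby
# Hypothetical record class that is materialized without calling initialize.
class StiRecord
  attr_reader :attributes

  def initialize
    raise "initialize is not used when loading from the database"
  end

  def init_with(coder)
    @attributes = coder["attributes"]
    @new_record = false
    self
  end

  def new_record?
    @new_record
  end

  # A blank type means the class itself; otherwise the named class is used.
  def self.find_sti_class(type_name)
    return self if type_name.nil? || type_name.empty?
    Object.const_get(type_name)
  end

  def self.instantiate(record)
    # allocate creates the object without running initialize.
    find_sti_class(record["type"]).allocate.init_with("attributes" => record)
  end
end

class Animal < StiRecord; end
class Dog < Animal; end
```

Loading a row whose `type` is `"Dog"` through `Animal.instantiate` therefore produces a `Dog` that is not a new record.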
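`init_with` calls `initialize_attributes` to initialize the attributes. This has two layers. One is implemented by `ActiveRecord::AttributeMethods::Serialization` and handles serialized attributes. The other is implemented by `ActiveRecord::Locking::Optimistic` and handles the lock version column.
First, the `Serialization` implementation: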
def initialize_attributes(attributes, options = {})
serialized = (options.delete(:serialized) { true }) ? :serialized : :unserialized
super(attributes, options)
serialized_attributes.each do |key, coder|
if attributes.key?(key)
attributes[key] = Attribute.new(coder, attributes[key], serialized)
end
end
attributes
end
The module also contains this code:
included do
# Returns a hash of all the attributes that have been specified for serialization as
# keys and their class restriction as values.
class_attribute :serialized_attributes
self.serialized_attributes = {}
end
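`serialized_attributes` is a Hash that maps each attribute to be serialized to its coder, and it is empty by default. During initialization, each attribute listed in `serialized_attributes` is wrapped in an `ActiveRecord::AttributeMethods::Serialization::Attribute` object, whose state is `:serialized` unless the options pass `:serialized` as `false` or `nil`. Methods such as `serialize` can then be called on that wrapper. The details of serialization are analyzed more thoroughly later in this text.
Next, the `Locking::Optimistic` part: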
# If the locking column has no default value set,
# start the lock version at zero. Note we can't use
# <tt>locking_enabled?</tt> at this point as
# <tt>@attributes</tt> may not have been initialized yet.
def initialize_attributes(attributes, options = {})
if attributes.key?(locking_column) && lock_optimistically
attributes[locking_column] ||= 0
end
attributes
end
This only initializes the attribute named by `locking_column` (`lock_version` by default) to 0 when it has no value.
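The two layers of `initialize_attributes` work through ordinary method lookup and `super`. A stripped-down sketch, in which the module names, `LayeredModel`, and the `[:serialized, value]` pair used as a wrapper are all assumptions made for illustration:

```ruby
# Inner layer: default the lock column to 0 when it is present but empty.
module LockingDefaults
  def initialize_attributes(attributes)
    attributes["lock_version"] ||= 0 if attributes.key?("lock_version")
    attributes
  end
end

# Outer layer: run the inner layer through super, then wrap serialized attributes.
module SerializedWrapping
  def initialize_attributes(attributes)
    super(attributes)
    serialized_keys.each do |key|
      attributes[key] = [:serialized, attributes[key]] if attributes.key?(key)
    end
    attributes
  end
end

class LayeredModel
  extend LockingDefaults
  extend SerializedWrapping  # extended last, so its method is found first

  def self.serialized_keys
    ["settings"]
  end
end
```

The order of the `extend` calls decides which layer runs first: the module extended last sits in front in the lookup chain and reaches the other one through `super`.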
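After initialization, control returns to `exec_queries`, which processes `@preload_values` and `@readonly_value`. Both are empty here, so it returns directly.
At this point a simple `User.all` has finished executing.
Next we will try more complex query conditions, more complex model relationships, and more complex features to study Active Record in greater depth.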
#####Find by Id#####
Next we tighten the search condition slightly. This time the code is:
User.find_by_id 1
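The `find` family is among the most commonly used lookup methods in Rails. Although the `find_by_xxxx` family was reduced to the `find_by` method from Rails 4 on, it is still well worth studying. First, as expected, the call lands in `method_missing`, defined in `ActiveRecord::DynamicMatchers` in `activerecord-3.2.13/lib/active_record/dynamic_matchers.rb`: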
# Enables dynamic finders like <tt>User.find_by_user_name(user_name)</tt> and
# <tt>User.scoped_by_user_name(user_name). Refer to Dynamic attribute-based finders
# section at the top of this file for more detailed information.
#
# It's even possible to use all the additional parameters to +find+. For example, the
# full interface for +find_all_by_amount+ is actually <tt>find_all_by_amount(amount, options)</tt>.
#
# Each dynamic finder using <tt>scoped_by_*</tt> is also defined in the class after it
# is first invoked, so that future attempts to use it do not run through method_missing.
def method_missing(method_id, *arguments, &block)
if match = (DynamicFinderMatch.match(method_id) || DynamicScopeMatch.match(method_id))
attribute_names = match.attribute_names
super unless all_attributes_exists?(attribute_names)
if !(match.is_a?(DynamicFinderMatch) && match.instantiator? && arguments.first.is_a?(Hash)) && arguments.size < attribute_names.size
method_trace = "#{__FILE__}:#{__LINE__}:in `#{method_id}'"
backtrace = [method_trace] + caller
raise ArgumentError, "wrong number of arguments (#{arguments.size} for #{attribute_names.size})", backtrace
end
if match.respond_to?(:scope?) && match.scope?
self.class_eval <<-METHOD, __FILE__, __LINE__ + 1
def self.#{method_id}(*args) # def self.scoped_by_user_name_and_password(*args)
attributes = Hash[[:#{attribute_names.join(',:')}].zip(args)] # attributes = Hash[[:user_name, :password].zip(args)]
#
scoped(:conditions => attributes) # scoped(:conditions => attributes)
end # end
METHOD
send(method_id, *arguments)
elsif match.finder?
options = if arguments.length > attribute_names.size
arguments.extract_options!
else
{}
end
relation = options.any? ? scoped(options) : scoped
relation.send :find_by_attributes, match, attribute_names, *arguments, &block
elsif match.instantiator?
scoped.send :find_or_instantiator_by_attributes, match, attribute_names, *arguments, &block
end
else
super
end
end
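The scoped_by_* branch above defines a real class method on first use, so later calls never reach method_missing again. Below is a minimal sketch of that define-on-first-call pattern in plain Ruby. The Catalog class and its misses counter are hypothetical and not part of Rails, and the generated method only returns the attributes hash instead of calling scoped:

```ruby
# Hypothetical class: turns dynamically dispatched calls into real methods on first use.
class Catalog
  attr_reader :misses

  def initialize
    @misses = 0 # counts how often method_missing had to do the work
  end

  def method_missing(name, *args, &block)
    if name.to_s =~ /\Ascoped_by_(\w+)\z/
      @misses += 1
      attribute_names = $1.split('_and_')
      # Define a real method so that later calls bypass method_missing.
      self.class.class_eval <<-METHOD, __FILE__, __LINE__ + 1
        def #{name}(*args)
          Hash[[:#{attribute_names.join(', :')}].zip(args)]
        end
      METHOD
      send(name, *args)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('scoped_by_') || super
  end
end

catalog = Catalog.new
p catalog.scoped_by_name_and_age('Ann', 30)
p catalog.scoped_by_name_and_age('Bob', 41)
p catalog.misses
```

The second call is served by the generated method, which is why the counter stays at one.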
Two matchers are involved here, DynamicFinderMatch and DynamicScopeMatch. We focus on DynamicFinderMatch:
module ActiveRecord
# = Active Record Dynamic Finder Match
#
# Refer to ActiveRecord::Base documentation for Dynamic attribute-based finders for detailed info
#
class DynamicFinderMatch
def self.match(method)
finder = :first
bang = false
instantiator = nil
case method.to_s
when /^find_(all_|last_)?by_([_a-zA-Z]\w*)$/
finder = :last if $1 == 'last_'
finder = :all if $1 == 'all_'
names = $2
when /^find_by_([_a-zA-Z]\w*)\!$/
bang = true
names = $1
when /^find_or_create_by_([_a-zA-Z]\w*)\!$/
bang = true
instantiator = :create
names = $1
when /^find_or_(initialize|create)_by_([_a-zA-Z]\w*)$/
instantiator = $1 == 'initialize' ? :new : :create
names = $2
else
return nil
end
new(finder, instantiator, bang, names.split('_and_'))
end
def initialize(finder, instantiator, bang, attribute_names)
@finder = finder
@instantiator = instantiator
@bang = bang
@attribute_names = attribute_names
end
attr_reader :finder, :attribute_names, :instantiator
def finder?
@finder && !@instantiator
end
def instantiator?
@finder == :first && @instantiator
end
def creator?
@finder == :first && @instantiator == :create
end
def bang?
@bang
end
def save_record?
@instantiator == :create
end
def save_method
bang? ? :save! : :save
end
end
end
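Here find_by_id matches the first when clause, and finder keeps its default value :first, meaning that only the first result is fetched. The method then returns an instance of DynamicFinderMatch.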
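The name matching can be tried in isolation. This is a simplified sketch that covers only the first when branch; match_finder is a hypothetical helper that returns a plain array rather than a DynamicFinderMatch instance:

```ruby
# Hypothetical, simplified matcher: returns [finder, attribute_names] or nil.
def match_finder(method_name)
  case method_name.to_s
  when /\Afind_(all_|last_)?by_([_a-zA-Z]\w*)\z/
    # No prefix means :first, mirroring the default in DynamicFinderMatch.match.
    finder = { 'all_' => :all, 'last_' => :last }.fetch($1, :first)
    [finder, $2.split('_and_')]
  end
end

p match_finder(:find_by_id)
p match_finder(:find_all_by_name_and_age)
p match_finder(:save)
```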
Next, all_attributes_exists? checks that the attributes named in the find_by call really exist:
def all_attributes_exists?(attribute_names)
(expand_attribute_names_for_aggregates(attribute_names) -
column_methods_hash.keys).empty?
end
# Similar in purpose to +expand_hash_conditions_for_aggregates+.
def expand_attribute_names_for_aggregates(attribute_names)
attribute_names.map { |attribute_name|
unless (aggregation = reflect_on_aggregation(attribute_name.to_sym)).nil?
aggregate_mapping(aggregation).map do |field_attr, _|
field_attr.to_sym
end
else
attribute_name.to_sym
end
}.flatten
end
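This method exists mainly for AggregateReflection: it expands the :mapping option of composed_of into the full list of underlying attributes. It is outside the scope of this article.
column_methods_hash, in turn, returns as many candidate method names as it can: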
# Returns a hash of all the methods added to query each of the columns in the table with the name of the method as the key
# and true as the value. This makes it possible to do O(1) lookups in respond_to? to check if a given method for attribute
# is available.
def column_methods_hash
@dynamic_methods_hash ||= column_names.inject(Hash.new(false)) do |methods, attr|
attr_name = attr.to_s
methods[attr.to_sym] = attr_name
methods["#{attr}=".to_sym] = attr_name
methods["#{attr}?".to_sym] = attr_name
methods["#{attr}_before_type_cast".to_sym] = attr_name
methods
end
end
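If the difference between the two is non-empty, some argument names an attribute that does not exist, and the call falls through to super, which raises an error. Back in method_missing, the next step checks whether too few arguments were passed. After that comes the main branch. For a DynamicScopeMatch (one that responds to scope?), a method with the same name is defined and then invoked; that method essentially just calls scoped. For the DynamicFinderMatch we care about here, the options are extracted first, scoped is called with them to obtain the right scope, and finally find_by_attributes is called on the resulting Relation object: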
def find_by_attributes(match, attributes, *args)
conditions = Hash[attributes.map {|a| [a, args[attributes.index(a)]]}]
result = where(conditions).send(match.finder)
if match.bang? && result.nil?
raise RecordNotFound, "Couldn't find #{@klass.name} with #{conditions.to_a.collect {|p| p.join(' = ')}.join(', ')}"
else
yield(result) if block_given?
result
end
end
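It first builds the attribute key-value pairs and passes them to where, then calls the method named by match.finder on the result. Here match.finder is :first; the other possible values are :last and :all. Below is the code of where: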
def where(opts, *rest)
return self if opts.blank?
relation = clone
relation.where_values += build_where(opts, rest)
relation
end
def build_where(opts, other = [])
case opts
when String, Array
[@klass.send(:sanitize_sql, other.empty? ? opts : ([opts] + other))]
when Hash
attributes = @klass.send(:expand_hash_conditions_for_aggregates, opts)
PredicateBuilder.build_from_hash(table.engine, attributes, table)
else
[opts]
end
end
The ActiveRecord::PredicateBuilder class is defined in activerecord-3.2.13/lib/active_record/relation/predicate_builder.rb and encapsulates the construction of these predicates:
module ActiveRecord
class PredicateBuilder # :nodoc:
def self.build_from_hash(engine, attributes, default_table, allow_table_name = true)
predicates = attributes.map do |column, value|
table = default_table
if allow_table_name && value.is_a?(Hash)
table = Arel::Table.new(column, engine)
if value.empty?
'1 = 2'
else
build_from_hash(engine, value, table, false)
end
else
column = column.to_s
if allow_table_name && column.include?('.')
table_name, column = column.split('.', 2)
table = Arel::Table.new(table_name, engine)
end
attribute = table[column]
case value
when ActiveRecord::Relation
value = value.select(value.klass.arel_table[value.klass.primary_key]) if value.select_values.empty?
attribute.in(value.arel.ast)
when Array, ActiveRecord::Associations::CollectionProxy
values = value.to_a.map {|x| x.is_a?(ActiveRecord::Base) ? x.id : x}
ranges, values = values.partition {|v| v.is_a?(Range) || v.is_a?(Arel::Relation)}
array_predicates = ranges.map {|range| attribute.in(range)}
if values.include?(nil)
values = values.compact
if values.empty?
array_predicates << attribute.eq(nil)
else
array_predicates << attribute.in(values.compact).or(attribute.eq(nil))
end
else
array_predicates << attribute.in(values)
end
array_predicates.inject {|composite, predicate| composite.or(predicate)}
when Range, Arel::Relation
attribute.in(value)
when ActiveRecord::Base
attribute.eq(value.id)
when Class
# FIXME: I think we need to deprecate this behavior
attribute.eq(value.name)
else
attribute.eq(value)
end
end
end
predicates.flatten
end
end
end
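The code looks complicated because nearly everything a WHERE clause can express is wrapped up here, but all we need is the call to eq on the attribute with the corresponding value. The result is an array whose single element is an Arel::Nodes::Equality object.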
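To see the dispatch on the value type without Arel, here is a rough sketch that renders SQL strings. predicate_for, predicates_from_hash and literal are hypothetical helpers; the real PredicateBuilder returns Arel nodes, and this sketch ignores table names, nested hashes and exclusive ranges:

```ruby
# Hypothetical helper: renders a Ruby value as a SQL literal.
def literal(value)
  value.is_a?(String) ? "'#{value.gsub("'", "''")}'" : value.to_s
end

# Hypothetical helper: picks the predicate form from the type of the value.
def predicate_for(column, value)
  case value
  when nil   then "#{column} IS NULL"
  when Array then "#{column} IN (#{value.map { |v| literal(v) }.join(', ')})"
  when Range then "#{column} BETWEEN #{literal(value.begin)} AND #{literal(value.end)}"
  else            "#{column} = #{literal(value)}"
  end
end

def predicates_from_hash(attributes)
  attributes.map { |column, value| predicate_for(column, value) }
end

p predicates_from_hash(:id => 1)
p predicates_from_hash(:name => "O'Neil", :age => [20, 30], :deleted_at => nil)
```

For find_by_id only the last branch is taken, which corresponds to the attribute.eq(value) call in the excerpt.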
The implementation of first is also very simple:
# A convenience wrapper for <tt>find(:first, *args)</tt>. You can pass in all the
# same arguments to this method as you can to <tt>find(:first)</tt>.
def first(*args)
if args.any?
if args.first.kind_of?(Integer) || (loaded? && !args.first.kind_of?(Hash))
limit(*args).to_a
else
apply_finder_options(args.first).first
end
else
find_first
end
end
def find_first
if loaded?
@records.first
else
@first ||= limit(1).to_a[0]
end
end
first accepts a number and returns that many leading records; without arguments it calls find_first. That method essentially runs limit(1).to_a[0], where limit is implemented very simply:
def limit(value)
relation = clone
relation.limit_value = value
relation
end
The implementation of to_a was explained earlier and is not repeated here.
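Both where and limit follow the same pattern: clone the relation, change one value on the copy, and return the copy. A minimal sketch of that pattern follows, assuming a hypothetical MiniRelation that filters an in-memory array instead of generating SQL:

```ruby
# Hypothetical miniature of a Relation: every query method returns a modified copy.
class MiniRelation
  attr_accessor :limit_value, :where_values

  def initialize(records)
    @records = records
    @where_values = []
    @limit_value = nil
  end

  def where(&condition)
    relation = clone
    relation.where_values += [condition] # builds a new array, the receiver is untouched
    relation
  end

  def limit(value)
    relation = clone
    relation.limit_value = value
    relation
  end

  # Nothing is evaluated until the records are actually requested.
  def to_a
    rows = @records.select { |record| where_values.all? { |c| c.call(record) } }
    limit_value ? rows.first(limit_value) : rows
  end

  def first
    limit(1).to_a[0]
  end
end

base  = MiniRelation.new([1, 2, 3, 4])
evens = base.where { |n| n.even? }
p evens.to_a
p evens.first
p base.to_a
```

Because each call works on a copy, base can be reused for other queries after evens has been derived from it.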
#####Find by Parameters#####
Now that we understand the internals of the find_by family, we move on to a more complex kind of query: one with named bind parameters. The query is:
User.where 'id = :id and name = :name and age = :age and admin = :admin', :id => id,
:name => name,
:age => age,
:admin => admin
First we enter where:
def where(opts, *rest)
return self if opts.blank?
relation = clone
relation.where_values += build_where(opts, rest)
relation
end
def build_where(opts, other = [])
case opts
when String, Array
[@klass.send(:sanitize_sql, other.empty? ? opts : ([opts] + other))]
when Hash
attributes = @klass.send(:expand_hash_conditions_for_aggregates, opts)
PredicateBuilder.build_from_hash(table.engine, attributes, table)
else
[opts]
end
end
We have read both methods before, but this time we explore the first branch of build_where, the sanitize_sql method:
# Accepts an array, hash, or string of SQL conditions and sanitizes
# them into a valid SQL fragment for a WHERE clause.
# ["name='%s' and group_id='%s'", "foo'bar", 4] returns "name='foo''bar' and group_id='4'"
# { :name => "foo'bar", :group_id => 4 } returns "name='foo''bar' and group_id='4'"
# "name='foo''bar' and group_id='4'" returns "name='foo''bar' and group_id='4'"
def sanitize_sql_for_conditions(condition, table_name = self.table_name)
return nil if condition.blank?
case condition
when Array; sanitize_sql_array(condition)
when Hash; sanitize_sql_hash_for_conditions(condition, table_name)
else condition
end
end
alias_method :sanitize_sql, :sanitize_sql_for_conditions
We have met this method before without analysing it in detail, so we study it closely here. A query with bound parameters always passes condition as an array, so we enter sanitize_sql_array:
# Accepts an array of conditions. The array has each value
# sanitized and interpolated into the SQL statement.
# ["name='%s' and group_id='%s'", "foo'bar", 4] returns "name='foo''bar' and group_id='4'"
def sanitize_sql_array(ary)
statement, *values = ary
if values.first.is_a?(Hash) && statement =~ /:\w+/
replace_named_bind_variables(statement, values.first)
elsif statement.include?('?')
replace_bind_variables(statement, values)
elsif statement.blank?
statement
else
statement % values.collect { |value| connection.quote_string(value.to_s) }
end
end
Our example uses named parameters, so the first condition matches and replace_named_bind_variables is called:
def replace_named_bind_variables(statement, bind_vars)
statement.gsub(/(:?):([a-zA-Z]\w*)/) do
if $1 == ':' # skip postgresql casts
$& # return the whole match
elsif bind_vars.include?(match = $2.to_sym)
quote_bound_value(bind_vars[match])
else
raise PreparedStatementInvalid, "missing value for :#{match} in #{statement}"
end
end
end
PostgreSQL uses a double colon for type casts, so those occurrences have to be skipped. The value for each named parameter is then passed to quote_bound_value:
def quote_bound_value(value, c = connection)
if value.respond_to?(:map) && !value.acts_like?(:string)
if value.respond_to?(:empty?) && value.empty?
c.quote(nil)
else
value.map { |v| c.quote(v) }.join(',')
end
else
c.quote(value)
end
end
This mainly quotes the value by calling connection.quote. SQLite largely follows the standard here, so the call lands in ActiveRecord::ConnectionAdapters::Quoting, the module that holds the quoting utilities, defined in activerecord-3.2.13/lib/active_record/connection_adapters/abstract/quoting.rb:
# Quotes the column value to help prevent
# {SQL injection attacks}[http://en.wikipedia.org/wiki/SQL_injection].
def quote(value, column = nil)
# records are quoted as their primary key
return value.quoted_id if value.respond_to?(:quoted_id)
case value
when String, ActiveSupport::Multibyte::Chars
value = value.to_s
return "'#{quote_string(value)}'" unless column
case column.type
when :binary then "'#{quote_string(column.string_to_binary(value))}'"
when :integer then value.to_i.to_s
when :float then value.to_f.to_s
else
"'#{quote_string(value)}'"
end
when true, false
if column && column.type == :integer
value ? '1' : '0'
else
value ? quoted_true : quoted_false
end
# BigDecimals need to be put in a non-normalized form and quoted.
when nil then "NULL"
when BigDecimal then value.to_s('F')
when Numeric then value.to_s
when Date, Time then "'#{quoted_date(value)}'"
when Symbol then "'#{quote_string(value.to_s)}'"
else
"'#{quote_string(YAML.dump(value))}'"
end
end
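Quoting is split by data type. Anything string-like is wrapped in single quotes, but the handling of quotes inside the string differs from case to case; we will not go into that here.
After several iterations every parameter in the SQL statement has been replaced by its actual value. The final SQL fragment is then added to the Relation object and merged into the generated SQL statement.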
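The substitution can be reproduced without a database connection. In the sketch below, replace_named_binds and quote are hypothetical stand-ins; in particular, the quoting rules are deliberately simplified (strings get single quotes with embedded quotes doubled) and do not cover every type the real quote method handles:

```ruby
# Hypothetical, simplified quoting.
def quote(value)
  case value
  when nil     then 'NULL'
  when true    then "'t'"
  when false   then "'f'"
  when Numeric then value.to_s
  else              "'#{value.to_s.gsub("'", "''")}'"
  end
end

def replace_named_binds(statement, bind_vars)
  statement.gsub(/(:?):([a-zA-Z]\w*)/) do
    if $1 == ':' # leave PostgreSQL casts such as ::integer alone
      $&
    elsif bind_vars.key?($2.to_sym)
      quote(bind_vars[$2.to_sym])
    else
      raise ArgumentError, "missing value for :#{$2} in #{statement}"
    end
  end
end

puts replace_named_binds('id = :id and name = :name', :id => 1, :name => "O'Neil")
puts replace_named_binds('age::integer = :age', :age => 30)
```

A placeholder without a matching key raises instead of being left in the statement, which is the same choice the excerpt makes with PreparedStatementInvalid.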
####Relations####
Next, let us look at how Active Record handles Relations. The code is:
class User < ActiveRecord::Base
has_many :blogs
end
user.blogs.to_a
This code simply defines a has_many association on the User class and then queries through it. We start with the code that defines the association:
def has_many(name, options = {}, &extension)
Builder::HasMany.build(self, name, options, &extension)
end
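This code is defined in ClassMethods of ActiveRecord::Associations. It refers to the Builder::HasMany class which, as the name suggests, builds has_many associations; its ancestors are CollectionAssociation and Association, defined in the same module. The build method called here is defined in CollectionAssociation: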
def self.build(model, name, options, &extension)
new(model, name, options, &extension).build
end
This instantiates Builder::HasMany and immediately calls its build method. build is layered across several classes; the innermost definition is in Builder::Association:
def build
validate_options
reflection = model.create_reflection(self.class.macro, name, options, model)
define_accessors
reflection
end
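validate_options checks that every key passed in is valid; the code is trivial and needs no explanation. Then create_reflection is called on model (here the User class). It is defined in the ActiveRecord::Reflection module and creates the various Reflection objects: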
def create_reflection(macro, name, options, active_record)
case macro
when :has_many, :belongs_to, :has_one, :has_and_belongs_to_many
klass = options[:through] ? ThroughReflection : AssociationReflection
reflection = klass.new(macro, name, options, active_record)
when :composed_of
reflection = AggregateReflection.new(macro, name, options, active_record)
end
self.reflections = self.reflections.merge(name => reflection)
reflection
end
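Three Reflection classes appear here: ThroughReflection, used with the :through option; the more general AssociationReflection; and AggregateReflection, used for composed_of. They all inherit from MacroReflection (the definitions are short, so they all live in the same file as the Reflection module). We use AssociationReflection here and create an instance of it. The association's name and the Reflection instance are then stored in the self.reflections hash for later lookup.
The following define_accessors defines the reader and writer methods for the association, including methods that read and write the associated objects and methods that read and write only their array of ids.
Next is the build method of Builder::CollectionAssociation: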
def build
wrap_block_extension
reflection = super
CALLBACKS.each { |callback_name| define_callback(callback_name) }
reflection
end
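wrap_block_extension just stores the extension module under :extend. The method itself is mainly responsible for defining callbacks, of which there are four: :before_add, :after_add, :before_remove and :after_remove.
Next comes the ActiveRecord::AutosaveAssociation module, which is included by HasMany and adds autosave support for associated Active Record objects: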
def build
reflection = super
model.send(:add_autosave_association_callbacks, reflection)
reflection
end
# Adds validation and save callbacks for the association as specified by
# the +reflection+.
#
# For performance reasons, we don't check whether to validate at runtime.
# However the validation and callback methods are lazy and those methods
# get created when they are invoked for the very first time. However,
# this can change, for instance, when using nested attributes, which is
# called _after_ the association has been defined. Since we don't want
# the callbacks to get defined multiple times, there are guards that
# check if the save or validation methods have already been defined
# before actually defining them.
def add_autosave_association_callbacks(reflection)
save_method = :"autosave_associated_records_for_#{reflection.name}"
validation_method = :"validate_associated_records_for_#{reflection.name}"
collection = reflection.collection?
unless method_defined?(save_method)
if collection
before_save :before_save_collection_association
define_non_cyclic_method(save_method, reflection) { save_collection_association(reflection) }
# Doesn't use after_save as that would save associations added in after_create/after_update twice
after_create save_method
after_update save_method
else
if reflection.macro == :has_one
define_method(save_method) { save_has_one_association(reflection) }
# Configures two callbacks instead of a single after_save so that
# the model may rely on their execution order relative to its
# own callbacks.
#
# For example, given that after_creates run before after_saves, if
# we configured instead an after_save there would be no way to fire
# a custom after_create callback after the child association gets
# created.
after_create save_method
after_update save_method
else
define_non_cyclic_method(save_method, reflection) { save_belongs_to_association(reflection) }
before_save save_method
end
end
end
if reflection.validate? && !method_defined?(validation_method)
method = (collection ? :validate_collection_association : :validate_single_association)
define_non_cyclic_method(validation_method, reflection) { send(method, reflection) }
validate validation_method
end
end
This method mainly adds callbacks. For collection associations that includes the before-save callback before_save_collection_association:
# Is used as a before_save callback to check while saving a collection
# association whether or not the parent was a new record before saving.
def before_save_collection_association
@new_record_before_save = new_record?
true
end
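After a create or update (or, for belongs_to, before the save), save_collection_association, save_has_one_association or save_belongs_to_association is called back; that code is analysed later.
Only then do we reach the build method defined by Builder::HasMany: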
def build
reflection = super
configure_dependency
reflection
end
def configure_dependency
if options[:dependent]
unless options[:dependent].in?([:destroy, :delete_all, :nullify, :restrict])
raise ArgumentError, "The :dependent option expects either :destroy, :delete_all, " \
":nullify or :restrict (#{options[:dependent].inspect})"
end
send("define_#{options[:dependent]}_dependency_method")
model.before_destroy dependency_method_name
end
end
When the :dependent option is given, this method adds a before_destroy callback. The code is simple and is not analysed here.
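Now let us run user.blogs and see what happens inside. What we skipped earlier is the body of define_accessors, which is in fact the entry point for reading and writing associated objects: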
def define_accessors
define_readers
define_writers
end
def define_readers
name = self.name
mixin.redefine_method(name) do |*params|
association(name).reader(*params)
end
end
def define_writers
name = self.name
mixin.redefine_method("#{name}=") do |value|
association(name).writer(value)
end
end
As the code shows, running build defines two methods, a reader and a writer for the association. So user.blogs runs the blogs reader defined earlier. We enter association first:
# Returns the association instance for the given name, instantiating it if it doesn't already exist
def association(name)
association = association_instance_get(name)
if association.nil?
reflection = self.class.reflect_on_association(name)
association = reflection.association_class.new(self, reflection)
association_instance_set(name, association)
end
association
end
association_instance_get and association_instance_set act as a cache and are simple enough to skip. We go straight to reflect_on_association:
# Returns the AssociationReflection object for the +association+ (use the symbol).
#
# Account.reflect_on_association(:owner) # returns the owner AssociationReflection
# Invoice.reflect_on_association(:line_items).macro # returns :has_many
#
def reflect_on_association(association)
reflections[association].is_a?(AssociationReflection) ? reflections[association] : nil
end
This simply fetches the Reflection object from the self.reflections hash populated earlier.
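Put together, the macro stores a reflection at class level and the reader lazily builds one association object per instance. Here is a minimal sketch of that arrangement; MiniModel, MiniAssociation and the Reflection struct are hypothetical names, not the Rails classes:

```ruby
# Hypothetical reflection: just the macro and the association name.
Reflection = Struct.new(:macro, :name)

class MiniAssociation
  attr_reader :owner, :reflection

  def initialize(owner, reflection)
    @owner, @reflection = owner, reflection
  end
end

class MiniModel
  # Class-level registry, filled in when the macro runs.
  def self.reflections
    @reflections ||= {}
  end

  def self.has_many(name)
    reflections[name] = Reflection.new(:has_many, name)
    define_method(name) { association(name) } # the reader
  end

  # Per-instance cache: the association object is created on first access.
  def association(name)
    @association_cache ||= {}
    @association_cache[name] ||= begin
      reflection = self.class.reflections.fetch(name)
      MiniAssociation.new(self, reflection)
    end
  end
end

class Author < MiniModel
  has_many :posts
end

author = Author.new
p author.posts.reflection.macro
p author.posts.equal?(author.posts)
```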
From that object, association_class returns the matching Association class:
def association_class
case macro
when :belongs_to
if options[:polymorphic]
Associations::BelongsToPolymorphicAssociation
else
Associations::BelongsToAssociation
end
when :has_and_belongs_to_many
Associations::HasAndBelongsToManyAssociation
when :has_many
if options[:through]
Associations::HasManyThroughAssociation
else
Associations::HasManyAssociation
end
when :has_one
if options[:through]
Associations::HasOneThroughAssociation
else
Associations::HasOneAssociation
end
end
end
Here we get Associations::HasManyAssociation, and an instance of it is created:
# CollectionAssociation initialize:
def initialize(owner, reflection)
super
@proxy = CollectionProxy.new(self)
end
# Association initialize:
def initialize(owner, reflection)
reflection.check_validity!
@target = nil
@owner, @reflection = owner, reflection
@updated = false
reset
reset_scope
end
Note that initialization creates a CollectionProxy object.
The validity of the reflection is checked first:
def check_validity!
check_validity_of_inverse!
end
def check_validity_of_inverse!
unless options[:polymorphic]
if has_inverse? && inverse_of.nil?
raise InverseOfAssociationNotFoundError.new(self)
end
end
end
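For non-polymorphic associations this only requires that, when :inverse_of is specified, the inverse association it names actually exists. Some variables are then initialized: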
def reset
@loaded = false
@target = []
end
def reset_scope
@association_scope = nil
end
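We then call reader on the HasManyAssociation. Note that it is defined in ActiveRecord::Associations::CollectionAssociation, part of the ActiveRecord::Associations hierarchy, which should not be confused with Builder::Association: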
def reader(force_reload = false)
if force_reload
klass.uncached { reload }
elsif stale_target?
reload
end
proxy
end
We then enter the proxy's to_ary method, defined in the CollectionProxy mentioned above:
def to_ary
load_target.dup
end
alias_method :to_a, :to_ary
load_target is a delegated method here; it actually calls the method of the same name on the ActiveRecord::Associations::HasManyAssociation object:
def load_target
if find_target?
@target = merge_target_lists(find_target, target)
end
loaded!
target
end
It first decides whether @target still needs to be loaded; the condition is:
def find_target?
!loaded? && (!owner.new_record? || foreign_key_present?) && klass
end
Then find_target does the actual lookup:
def find_target
records =
if options[:finder_sql]
reflection.klass.find_by_sql(custom_finder_sql)
else
scoped.all
end
records = options[:uniq] ? uniq(records) : records
records.each { |record| set_inverse_instance(record) }
records
end
We are now close to the actual database query. Since :finder_sql is not specified, scoped.all is executed, which fetches every object matching the conditions:
def scoped
target_scope.merge(association_scope)
end
This scoped differs from the one in ActiveRecord::Scoping::Named: it is the merge of two scopes. The first is target_scope:
def target_scope
klass.scoped
end
target_scope is the scope of the class itself, that is, the ActiveRecord::Scoping::Named implementation covered earlier, so we do not repeat it. Next is association_scope:
# The scope for this association.
#
# Note that the association_scope is merged into the target_scope only when the
# scoped method is called. This is because at that point the call may be surrounded
# by scope.scoping { ... } or with_scope { ... } etc, which affects the scope which
# actually gets built.
def association_scope
if klass
@association_scope ||= AssociationScope.new(self).scope
end
end
This introduces another new class, AssociationScope, which implements the foreign-key part of the query. An instance of it is created:
def initialize(association)
@association = association
@alias_tracker = AliasTracker.new klass.connection
end
Then its scope method is called:
def scope
scope = klass.unscoped
scope = scope.extending(*Array.wrap(options[:extend]))
# It's okay to just apply all these like this. The options will only be present if the
# association supports that option; this is enforced by the association builder.
scope = scope.apply_finder_options(options.slice(
:readonly, :include, :order, :limit, :joins, :group, :having, :offset, :select))
if options[:through] && !options[:include]
scope = scope.includes(source_options[:include])
end
scope = scope.uniq if options[:uniq]
add_constraints(scope)
end
unscoped temporarily drops any default scope set earlier and returns a clean scope. It is defined in ActiveRecord::Scoping::Default:
# Returns a scope for the model without the default_scope.
def unscoped
block_given? ? relation.scoping { yield } : relation
end
It calls relation to create a fresh Relation object. The rest of scope adds various query conditions to the scope; in our example those can be ignored, and only the last method matters:
def add_constraints(scope)
tables = construct_tables
chain.each_with_index do |reflection, i|
table, foreign_table = tables.shift, tables.first
if reflection.source_macro == :has_and_belongs_to_many
join_table = tables.shift
scope = scope.joins(join(
join_table,
table[reflection.association_primary_key].
eq(join_table[reflection.association_foreign_key])
))
table, foreign_table = join_table, tables.first
end
if reflection.source_macro == :belongs_to
if reflection.options[:polymorphic]
key = reflection.association_primary_key(klass)
else
key = reflection.association_primary_key
end
foreign_key = reflection.foreign_key
else
key = reflection.foreign_key
foreign_key = reflection.active_record_primary_key
end
conditions = self.conditions[i]
if reflection == chain.last
scope = scope.where(table[key].eq(owner[foreign_key]))
if reflection.type
scope = scope.where(table[reflection.type].eq(owner.class.base_class.name))
end
conditions.each do |condition|
if options[:through] && condition.is_a?(Hash)
condition = disambiguate_condition(table, condition)
end
scope = scope.where(interpolate(condition))
end
else
constraint = table[key].eq(foreign_table[foreign_key])
if reflection.type
type = chain[i + 1].klass.base_class.name
constraint = constraint.and(table[reflection.type].eq(type))
end
scope = scope.joins(join(foreign_table, constraint))
unless conditions.empty?
scope = scope.where(sanitize(conditions, table))
end
end
end
scope
end
First, construct_tables is called to build the Arel::Table objects, one per reflection in the chain. construct_tables is defined in ActiveRecord::Associations::JoinHelper:
def construct_tables
tables = []
chain.each do |reflection|
tables << alias_tracker.aliased_table_for(
table_name_for(reflection),
table_alias_for(reflection, reflection != self.reflection)
)
if reflection.source_macro == :has_and_belongs_to_many
tables << alias_tracker.aliased_table_for(
(reflection.source_reflection || reflection).options[:join_table],
table_alias_for(reflection, true)
)
end
end
tables
end
The table name and table alias are computed first, as follows:
def table_name_for(reflection)
reflection.table_name
end
def table_alias_for(reflection, join = false)
name = "#{reflection.plural_name}_#{alias_suffix}"
name << "_join" if join
name
end
Then the AliasTracker object is used to create the Arel::Table. This class exists to prevent alias collisions in joins. AliasTracker#aliased_table_for is implemented as follows:
def aliased_table_for(table_name, aliased_name = nil)
table_alias = aliased_name_for(table_name, aliased_name)
if table_alias == table_name
Arel::Table.new(table_name)
else
Arel::Table.new(table_name).alias(table_alias)
end
end
def aliased_name_for(table_name, aliased_name = nil)
aliased_name ||= table_name
if aliases[table_name].zero?
# If it's zero, we can have our table_name
aliases[table_name] = 1
table_name
else
# Otherwise, we need to use an alias
aliased_name = connection.table_alias_for(aliased_name)
# Update the count
aliases[aliased_name] += 1
if aliases[aliased_name] > 1
"#{truncate(aliased_name)}_#{aliases[aliased_name]}"
else
aliased_name
end
end
end
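If the aliases hash already has an entry for the table name, another alias is generated in its place. A :has_and_belongs_to_many association would also need a table for the join table, but we do not need one here.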
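The counting logic is small enough to run on its own. The sketch below uses a hypothetical MiniAliasTracker and leaves out two things the excerpt does: the adapter's table_alias_for normalisation and the truncation of long aliases:

```ruby
# Hypothetical, simplified alias tracker: the first use of a table keeps its
# plain name, later uses get the suggested alias, numbered when it repeats.
class MiniAliasTracker
  def initialize
    @aliases = Hash.new(0) # unknown names count as zero uses
  end

  def aliased_name_for(table_name, aliased_name = nil)
    aliased_name ||= table_name
    if @aliases[table_name].zero?
      @aliases[table_name] = 1
      table_name
    else
      @aliases[aliased_name] += 1
      count = @aliases[aliased_name]
      count > 1 ? "#{aliased_name}_#{count}" : aliased_name
    end
  end
end

tracker = MiniAliasTracker.new
3.times { puts tracker.aliased_name_for('users', 'followers_users_join') }
```

This matters for self-joins such as the followers example later in this chapter, where the users table appears more than once in the same query.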
Back in add_constraints, the foreign key name and primary key name needed for the join query are produced by the reflection's foreign_key and active_record_primary_key methods:
def foreign_key
@foreign_key ||= options[:foreign_key] || derive_foreign_key
end
def active_record_primary_key
@active_record_primary_key ||= options[:primary_key] || primary_key(active_record)
end
Both methods use the value from the options by default. The fallback for the primary key is the primary_key of the Active Record class, and the fallback for foreign_key is:
def derive_foreign_key
if belongs_to?
"#{name}_id"
elsif options[:as]
"#{options[:as]}_id"
else
active_record.name.foreign_key
end
end
As you can see, it is derived entirely from Rails conventions.
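The three branches of the convention can be sketched without ActiveSupport. Here underscore is a hypothetical, reduced stand-in for the inflector (it does not handle namespaces or acronyms), and derive_foreign_key takes the relevant options as keyword arguments purely for illustration:

```ruby
# Hypothetical stand-in for ActiveSupport's underscore inflection.
def underscore(class_name)
  class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Hypothetical signature: the real method reads these values from the reflection.
def derive_foreign_key(owner_class_name, name, belongs_to: false, as: nil)
  if belongs_to
    "#{name}_id"                         # the key lives on the owner's own table
  elsif as
    "#{as}_id"                           # polymorphic association
  else
    "#{underscore(owner_class_name)}_id" # the key lives on the associated table
  end
end

puts derive_foreign_key('User', :blogs)
puts derive_foreign_key('Blog', :user, belongs_to: true)
puts derive_foreign_key('Post', :comments, as: :commentable)
```

For our User has_many :blogs example the last branch applies, so the blogs table is expected to carry a user_id column.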
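Then where is called on the scope to add the query condition. If there is a type column a condition on type must be added as well, along with any other conditions, so that scope returns a scope containing every query condition.
Next, merge combines the two scopes. The merge method used here is defined in ActiveRecord::SpawnMethods: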
def merge(r)
return self unless r
return to_a & r if r.is_a?(Array)
merged_relation = clone
r = r.with_default_scope if r.default_scoped? && r.klass != klass
Relation::ASSOCIATION_METHODS.each do |method|
value = r.send(:"#{method}_values")
unless value.empty?
if method == :includes
merged_relation = merged_relation.includes(value)
else
merged_relation.send(:"#{method}_values=", value)
end
end
end
(Relation::MULTI_VALUE_METHODS - [:joins, :where, :order]).each do |method|
value = r.send(:"#{method}_values")
merged_relation.send(:"#{method}_values=", merged_relation.send(:"#{method}_values") + value) if value.present?
end
merged_relation.joins_values += r.joins_values
merged_wheres = @where_values + r.where_values
unless @where_values.empty?
# Remove duplicates, last one wins.
seen = Hash.new { |h,table| h[table] = {} }
merged_wheres = merged_wheres.reverse.reject { |w|
nuke = false
if w.respond_to?(:operator) && w.operator == :==
name = w.left.name
table = w.left.relation.name
nuke = seen[table][name]
seen[table][name] = true
end
nuke
}.reverse
end
merged_relation.where_values = merged_wheres
(Relation::SINGLE_VALUE_METHODS - [:lock, :create_with, :reordering]).each do |method|
value = r.send(:"#{method}_value")
merged_relation.send(:"#{method}_value=", value) unless value.nil?
end
merged_relation.lock_value = r.lock_value unless merged_relation.lock_value
merged_relation = merged_relation.create_with(r.create_with_value) unless r.create_with_value.empty?
if (r.reordering_value)
# override any order specified in the original relation
merged_relation.reordering_value = true
merged_relation.order_values = r.order_values
else
# merge in order_values from r
merged_relation.order_values += r.order_values
end
# Apply scope extension modules
merged_relation.send :apply_modules, r.extensions
merged_relation
end
The method is long, but it just copies values over one by one, so we do not analyse it in detail.
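One part of merge that is not a plain copy is the handling of duplicate equality conditions, where the last one wins. A sketch of just that step follows, with a hypothetical Predicate struct in place of Arel nodes:

```ruby
# Hypothetical predicate: table, column, operator and value.
Predicate = Struct.new(:table, :column, :operator, :value)

def merge_wheres(left, right)
  seen = Hash.new { |h, table| h[table] = {} }
  # Walk from the end so the last equality on a column is the one that is kept.
  (left + right).reverse.reject { |w|
    nuke = false
    if w.operator == :==
      nuke = seen[w.table][w.column]
      seen[w.table][w.column] = true
    end
    nuke
  }.reverse
end

left  = [Predicate.new('users', 'id', :==, 1), Predicate.new('users', 'age', :>, 18)]
right = [Predicate.new('users', 'id', :==, 2)]
merge_wheres(left, right).each { |w| puts "#{w.table}.#{w.column} #{w.operator} #{w.value}" }
```

Only equality predicates are deduplicated; the comparison on age survives because its operator is not :==.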
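Back in find_target, all is called on scoped to run the actual database query; see the earlier exec_queries discussion for the details.
Once the actual Active Record objects are available, merge_target_lists merges the result of find_target with target: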
# We have some records loaded from the database (persisted) and some that are
# in-memory (memory). The same record may be represented in the persisted array
# and in the memory array.
#
# So the task of this method is to merge them according to the following rules:
#
# * The final array must not have duplicates
# * The order of the persisted array is to be preserved
# * Any changes made to attributes on objects in the memory array are to be preserved
# * Otherwise, attributes should have the value found in the database
def merge_target_lists(persisted, memory)
return persisted if memory.empty?
return memory if persisted.empty?
persisted.map! do |record|
# Unfortunately we cannot simply do memory.delete(record) since on 1.8 this returns
# record rather than memory.at(memory.index(record)). The behavior is fixed in 1.9.
mem_index = memory.index(record)
if mem_index
mem_record = memory.delete_at(mem_index)
((record.attribute_names & mem_record.attribute_names) - mem_record.changes.keys).each do |name|
mem_record[name] = record[name]
end
mem_record
else
record
end
end
persisted + memory
end
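The merge removes duplicates and, for a record present in both lists, keeps the in-memory object while refreshing its unchanged attributes from the database. When it finishes, loaded! marks the HasManyAssociation as loaded and the queried data is returned.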
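The merge rules can be checked with plain objects. In this sketch MiniRecord is a hypothetical record that compares by id and carries a list of attribute names changed in memory; unlike the excerpt, the sketch works on copies instead of mutating its arguments:

```ruby
# Hypothetical record: equality by id, plus the names of attributes changed in memory.
class MiniRecord
  attr_reader :attributes, :changed

  def initialize(attributes, changed = [])
    @attributes = attributes
    @changed = changed
  end

  def ==(other)
    other.is_a?(MiniRecord) && attributes[:id] == other.attributes[:id]
  end
end

def merge_target_lists(persisted, memory)
  return persisted if memory.empty?
  return memory if persisted.empty?

  memory = memory.dup
  merged = persisted.map do |record|
    index = memory.index(record)
    next record unless index

    # Keep the in-memory object, but refresh every attribute it has not changed.
    mem_record = memory.delete_at(index)
    (record.attributes.keys - mem_record.changed).each do |name|
      mem_record.attributes[name] = record.attributes[name]
    end
    mem_record
  end
  merged + memory # unsaved in-memory records go last
end

db      = [MiniRecord.new({ :id => 1, :title => 'one', :views => 10 }),
           MiniRecord.new({ :id => 2, :title => 'two', :views => 20 })]
edited  = MiniRecord.new({ :id => 2, :title => 'edited', :views => 0 }, [:title])
unsaved = MiniRecord.new({ :id => nil, :title => 'draft' })

merge_target_lists(db, [edited, unsaved]).each { |r| p r.attributes }
```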
Next comes a more complex Relation; analysing it will deepen our understanding of Rails associations.
class User < ActiveRecord::Base
has_and_belongs_to_many :followers, class_name: 'User', foreign_key: 'follow_id', association_foreign_key: 'follower_id', :join_table => 'follows'
has_many :followers_comments, through: :followers, :source => :comments
has_many :comments
end
user1.followers_comments
First we enter has_and_belongs_to_many:
def has_and_belongs_to_many(name, options = {}, &extension)
Builder::HasAndBelongsToMany.build(self, name, options, &extension)
end
It looks much like the implementation of has_many. Its build is:
def build
reflection = super
check_validity(reflection)
define_destroy_hook
reflection
end
HasAndBelongsToMany has the same parent class as HasMany, namely CollectionAssociation, so the super call is not analysed again.
check_validity is implemented as:
def check_validity(reflection)
if reflection.association_foreign_key == reflection.foreign_key
raise ActiveRecord::HasAndBelongsToManyAssociationForeignKeyNeeded.new(reflection)
end
reflection.options[:join_table] ||= join_table_name(
model.send(:undecorated_table_name, model.to_s),
model.send(:undecorated_table_name, reflection.class_name)
)
end
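:association_foreign_key and :foreign_key must not be identical, otherwise the association would be meaningless. Then, if no join table name was given, one is generated by combining the two table names. The combining method is join_table_name: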
# Generates a join table name from two provided table names.
# The names in the join table names end up in lexicographic order.
#
# join_table_name("members", "clubs") # => "clubs_members"
# join_table_name("members", "special_clubs") # => "members_special_clubs"
def join_table_name(first_table_name, second_table_name)
if first_table_name < second_table_name
join_table = "#{first_table_name}_#{second_table_name}"
else
join_table = "#{second_table_name}_#{first_table_name}"
end
model.table_name_prefix + join_table + model.table_name_suffix
end
The lexicographically smaller table name comes first.
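The ordering rule is easy to verify against the two examples in the source comment. A standalone sketch follows, where the prefix and suffix are plain parameters defaulting to empty strings instead of being read from the model:

```ruby
# Standalone variant for illustration; sorting two names is equivalent to the
# < comparison in the excerpt.
def join_table_name(first_table_name, second_table_name, prefix = '', suffix = '')
  prefix + [first_table_name, second_table_name].sort.join('_') + suffix
end

puts join_table_name('members', 'clubs')
puts join_table_name('members', 'special_clubs')
```

In our example the option :join_table => 'follows' is given explicitly, so this derivation is skipped.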
Then a destroy hook is defined by define_destroy_hook:
def define_destroy_hook
name = self.name
model.send(:include, Module.new {
class_eval <<-RUBY, __FILE__, __LINE__ + 1
def destroy_associations
association(#{name.to_sym.inspect}).delete_all_on_destroy
super
end
RUBY
})
end
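Including an anonymous module lets each association add its own step in front of an inherited destroy_associations while still reaching the original through super. Below is a minimal sketch of that technique; BaseModel, Account and the log array are hypothetical and only record the call order:

```ruby
# Hypothetical base class providing the default destroy_associations.
class BaseModel
  attr_reader :log

  def initialize
    @log = []
  end

  def destroy_associations
    log << :base
  end

  # Each call includes a fresh anonymous module that overrides the method
  # and then hands control on with super.
  def self.add_destroy_hook(name)
    include Module.new {
      class_eval <<-RUBY, __FILE__, __LINE__ + 1
        def destroy_associations
          log << #{name.to_sym.inspect}
          super
        end
      RUBY
    }
  end
end

class Account < BaseModel
  add_destroy_hook :followers
  add_destroy_hook :tags
end

account = Account.new
account.destroy_associations
p account.log
```

The module included last sits closest to the class in the ancestor chain, so its hook runs first.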
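Next we enter user1.followers_comments. The association is still :has_many; adding :through only switches to the ThroughReflection class and creates a HasManyThroughAssociation object (both points were mentioned earlier). HasManyThroughAssociation is a subclass of HasManyAssociation, so the initialization code is identical and is not repeated. We start directly from find_target, which is defined in HasManyThroughAssociation: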
def find_target
return [] unless target_reflection_has_associated_record?
scoped.all
end
An extra check, target_reflection_has_associated_record?, is added here:
def target_reflection_has_associated_record?
if through_reflection.macro == :belongs_to && owner[through_reflection.foreign_key].blank?
false
else
true
end
end
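When the through association is :belongs_to and the owner's foreign key is blank, no result can exist, so false is returned directly. Our through association is :has_and_belongs_to_many, so it always returns true.
scoped is still the merge of two scopes: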
def scoped
target_scope.merge(association_scope)
end
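But the definitions are completely different now. target_scope is defined in the ThroughAssociation module and overrides the definition in Association, so it applies to every association with the :through option: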
# We merge in these scopes for two reasons:
#
# 1. To get the default_scope conditions for any of the other reflections in the chain
# 2. To get the type conditions for any STI models in the chain
def target_scope
scope = super
chain[1..-1].each do |reflection|
scope = scope.merge(
reflection.klass.scoped.with_default_scope.
except(:select, :create_with, :includes, :preload, :joins, :eager_load)
)
end
scope
end
chain is likewise overridden here, in the ThroughReflection class:
# Returns an array of reflections which are involved in this association. Each item in the
# array corresponds to a table which will be part of the query for this association.
#
# The chain is built by recursively calling #chain on the source reflection and the through
# reflection. The base case for the recursion is a normal association, which just returns
# [self] as its #chain.
def chain
@chain ||= begin
chain = source_reflection.chain + through_reflection.chain
chain[0] = self # Use self so we don't lose the information from :source_type
chain
end
end
Here source_reflection and through_reflection are the two reflections behind this :has_many association. Their implementations are:
def source_reflection
@source_reflection ||= source_reflection_names.collect { |name| through_reflection.klass.reflect_on_association(name) }.compact.first
end
def through_reflection
@through_reflection ||= active_record.reflect_on_association(options[:through])
end
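The first element of the chain, which is the first result of source_reflection.chain, is then replaced with the current object, an ActiveRecord::Reflection::ThroughReflection, so that its attributes (such as :source_type) are not lost.
Next, target_scope takes the scope obtained earlier and merges it, one by one, with the scopes of the remaining reflections in the chain, the ones connected via :through. The merge itself was analyzed earlier and is not the focus here, so it is skipped.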
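The recursive construction of chain can be sketched in plain Ruby. This is only an illustration: SimpleReflection and SimpleThroughReflection are hypothetical stand-ins that model nothing but #chain, and the association names are assumed from the running example.

```ruby
# Hypothetical stand-ins for the reflection classes; only #chain is modeled.
class SimpleReflection
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Base case: the chain of a normal association is just itself.
  def chain
    [self]
  end
end

class SimpleThroughReflection < SimpleReflection
  def initialize(name, source, through)
    super(name)
    @source = source
    @through = through
  end

  # Recursive case: source chain first, then through chain, with the
  # first element replaced by the through reflection itself.
  def chain
    @chain ||= begin
      chain = @source.chain + @through.chain
      chain[0] = self
      chain
    end
  end
end

comments  = SimpleReflection.new(:comments)
followers = SimpleReflection.new(:followers)
followers_comments = SimpleThroughReflection.new(:followers_comments, comments, followers)

p followers_comments.chain.map(&:name) # => [:followers_comments, :followers]
```

The last element of the chain is the reflection closest to the owner, which matches the order in which add_constraints later walks the chain.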
The implementation of association_scope is the real focus. Most of the code has already been analyzed, but the handling of has_and_belongs_to_many in add_constraints deserves a closer look:
def add_constraints(scope)
  tables = construct_tables
  chain.each_with_index do |reflection, i|
    table, foreign_table = tables.shift, tables.first
    if reflection.source_macro == :has_and_belongs_to_many
      join_table = tables.shift
      scope = scope.joins(join(
        join_table,
        table[reflection.association_primary_key].
          eq(join_table[reflection.association_foreign_key])
      ))
      table, foreign_table = join_table, tables.first
    end
    if reflection.source_macro == :belongs_to
      if reflection.options[:polymorphic]
        key = reflection.association_primary_key(klass)
      else
        key = reflection.association_primary_key
      end
      foreign_key = reflection.foreign_key
    else
      key = reflection.foreign_key
      foreign_key = reflection.active_record_primary_key
    end
    conditions = self.conditions[i]
    if reflection == chain.last
      scope = scope.where(table[key].eq(owner[foreign_key]))
      if reflection.type
        scope = scope.where(table[reflection.type].eq(owner.class.base_class.name))
      end
      conditions.each do |condition|
        if options[:through] && condition.is_a?(Hash)
          condition = disambiguate_condition(table, condition)
        end
        scope = scope.where(interpolate(condition))
      end
    else
      constraint = table[key].eq(foreign_table[foreign_key])
      if reflection.type
        type = chain[i + 1].klass.base_class.name
        constraint = constraint.and(table[reflection.type].eq(type))
      end
      scope = scope.joins(join(foreign_table, constraint))
      unless conditions.empty?
        scope = scope.where(sanitize(conditions, table))
      end
    end
  end
  scope
end
First, let us look at construct_tables again:
def construct_tables
tables = []
chain.each do |reflection|
tables << alias_tracker.aliased_table_for(
table_name_for(reflection),
table_alias_for(reflection, reflection != self.reflection)
)
if reflection.source_macro == :has_and_belongs_to_many
tables << alias_tracker.aliased_table_for(
(reflection.source_reflection || reflection).options[:join_table],
table_alias_for(reflection, true)
)
end
end
tables
end
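Note that when a reflection is :has_and_belongs_to_many, two table objects are created, one of which is the join table.
When processing the first element of the chain, the :has_many reflection, the two tables of the :has_many association are joined and the join condition is added: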
constraint = table[key].eq(foreign_table[foreign_key])
if reflection.type
type = chain[i + 1].klass.base_class.name
constraint = constraint.and(table[reflection.type].eq(type))
end
scope = scope.joins(join(foreign_table, constraint))
unless conditions.empty?
scope = scope.where(sanitize(conditions, table))
end
This uses many Arel APIs, but their method names are enough to understand what they do.
Processing the :has_and_belongs_to_many reflection additionally runs this code:
join_table = tables.shift
scope = scope.joins(join(
join_table,
table[reflection.association_primary_key].
eq(join_table[reflection.association_foreign_key])
))
table, foreign_table = join_table, tables.first
This builds the join between the current table and the join table. Then, since it is the last reflection in the chain, it runs:
scope = scope.where(table[key].eq(owner[foreign_key]))
if reflection.type
scope = scope.where(table[reflection.type].eq(owner.class.base_class.name))
end
conditions.each do |condition|
if options[:through] && condition.is_a?(Hash)
condition = disambiguate_condition(table, condition)
end
scope = scope.where(interpolate(condition))
end
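No further join is needed here: the key value that the join table must match is already known from the owner, so it is written directly into the SQL as a condition, which is easier to optimize.
Finally, the resulting scope is merged with the earlier one (in practice only the from and select clauses of target_scope are used; all joins and where clauses come from association_scope), which yields the complete query.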
To wrap up Active Record associations, this part covers a few remaining interesting features. Here is the sample code:
class Picture < ActiveRecord::Base
belongs_to :imageable, :polymorphic => true
attr_accessible :name
default_scope order('created_at desc')
scope :of_employees, where(:imageable_type => 'Employee')
scope :of_products, -> { where(:imageable_type => 'Product') }
end
class Employee < ActiveRecord::Base
attr_accessible :name
has_many :pictures, :as => :imageable
end
class Product < ActiveRecord::Base
attr_accessible :name
has_many :pictures, :as => :imageable
end
employee.pictures.build :name => 'my avatar.png'
employee.pictures.create :name => 'my avatar 2.png'
Picture.of_employees
Picture.of_products
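Although the code looks a bit long, it only involves three features: Build Association, Polymorphic Associations and Scope Querying. Build Association and Scope Querying each appear in two similar forms, and we will point out how the forms differ.
We begin with the build method of the :has_many association. It is a method delegated to @association; the delegation is declared in ActiveRecord::Associations::CollectionProxy: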
delegate :select, :find, :first, :last,
:build, :create, :create!,
:concat, :replace, :delete_all, :destroy_all, :delete, :destroy, :uniq,
:sum, :count, :size, :length, :empty?,
:any?, :many?, :include?,
:to => :@association
So build is implemented in ActiveRecord::Associations::CollectionAssociation:
def build(attributes = {}, options = {}, &block)
if attributes.is_a?(Array)
attributes.collect { |attr| build(attr, options, &block) }
else
add_to_target(build_record(attributes, options)) do |record|
yield(record) if block_given?
end
end
end
There are two main steps, build_record and add_to_target. build_record is implemented in the base class ActiveRecord::Associations::Association:
def build_record(attributes, options)
reflection.build_association(attributes, options) do |record|
skip_assign = [reflection.foreign_key, reflection.type].compact
attributes = create_scope.except(*(record.changed - skip_assign))
record.assign_attributes(attributes, :without_protection => true)
end
end
It calls build_association on the Reflection object to create the associated object:
def build_association(*options, &block)
klass.new(*options, &block)
end
This simply creates the Active Record object and assigns its attributes. The block passed at creation time runs before the initialize callback:
skip_assign = [reflection.foreign_key, reflection.type].compact
attributes = create_scope.except(*(record.changed - skip_assign))
record.assign_attributes(attributes, :without_protection => true)
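Of all the attributes that have already been changed on the record, only the reflection's foreign_key and type stay in the assignment list. Here type is not the STI type column but the type column of the polymorphic association: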
def type
@type ||= options[:as] && "#{options[:as]}_type"
end
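The implementation of foreign_key was explained earlier; its default value, derive_foreign_key, supports Polymorphic Associations as well.
create_scope first builds the Arel structure needed for the polymorphic association, then extracts its where clauses and converts them to a Hash: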
def create_scope
scoped.scope_for_create.stringify_keys
end
The where clauses of scoped are produced by add_constraints, discussed earlier, which also generates the conditions for Polymorphic Associations.
Then its scope_for_create method is called:
def scope_for_create
@scope_for_create ||= where_values_hash.merge(create_with_value)
end
where_values_hash returns the Hash form of the where clauses in the Arel structure:
def where_values_hash
equalities = with_default_scope.where_values.grep(Arel::Nodes::Equality).find_all { |node|
node.left.relation.name == table_name
}
Hash[equalities.map { |where| [where.left.name, where.right] }].with_indifferent_access
end
The implementation is straightforward. It only handles conditions of the form "column equals value", but that is sufficient for Polymorphic Associations.
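The filtering idea behind where_values_hash can be sketched without Arel. The node types below (SketchColumn, SketchEquality, SketchGreaterThan) are hypothetical stand-ins, not the Arel API; the sketch only assumes that equality conditions can be told apart from other conditions by their class.

```ruby
# Hypothetical node types standing in for Arel nodes; not the Arel API.
SketchColumn      = Struct.new(:table, :name)
SketchEquality    = Struct.new(:left, :right)
SketchGreaterThan = Struct.new(:left, :right)

# Keep only equality conditions on the given table and turn them into a Hash.
def where_values_to_hash(where_values, table_name)
  equalities = where_values.grep(SketchEquality).select do |node|
    node.left.table == table_name
  end
  equalities.map { |node| [node.left.name, node.right] }.to_h
end

conditions = [
  SketchEquality.new(SketchColumn.new('pictures', 'imageable_id'), 1),
  SketchEquality.new(SketchColumn.new('pictures', 'imageable_type'), 'Employee'),
  SketchGreaterThan.new(SketchColumn.new('pictures', 'size'), 100),
  SketchEquality.new(SketchColumn.new('users', 'id'), 7)
]

# Only the two equality conditions on "pictures" survive.
p where_values_to_hash(conditions, 'pictures')
```

Conditions that are not equalities, or that target another table, are dropped, which is why the result is safe to use as default attributes for a new record.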
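Back in build_record: the reason only the foreign key and the polymorphic type are kept from the record's changed attributes is that the other values were already assigned through assign_attributes when the instance was created. The Hash just generated is now passed to assign_attributes again. The :without_protection option prevents the assignment from being dropped because an attribute is on the blacklist or missing from the whitelist; this feature is explained in detail later.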
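The attribute filtering in build_record can be illustrated with plain Hashes and Arrays. The helper name attributes_from_scope and the sample values are assumptions made for this sketch; the real code works on the record and the reflection.

```ruby
# create_scope: attributes implied by the association (a plain Hash here).
# changed:      names of attributes the caller already set on the new record.
# skip_assign:  keys that must always come from the association.
def attributes_from_scope(create_scope, changed, skip_assign)
  excluded = changed - skip_assign
  create_scope.reject { |key, _| excluded.include?(key) }
end

scope   = { 'imageable_id' => 1, 'imageable_type' => 'Employee', 'name' => 'default.png' }
changed = ['name', 'imageable_id']
skip    = ['imageable_id', 'imageable_type']

# 'name' was set by the caller, so the scope value is not applied over it;
# 'imageable_id' was also changed, but it is in skip_assign and is kept.
p attributes_from_scope(scope, changed, skip)
```

The effect is that values passed explicitly to build win over values derived from the scope, except for the foreign key and the polymorphic type, which are always taken from the association.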
Next, the new record is added to the collection it belongs to, but only in memory; the actual insert happens only when save is called. This step is done by add_to_target:
def add_to_target(record)
callback(:before_add, record)
yield(record) if block_given?
if options[:uniq] && index = @target.index(record)
@target[index] = record
else
@target << record
end
callback(:after_add, record)
set_inverse_instance(record)
record
end
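The before_add and after_add callbacks are invoked before and after the record is added. callbacks_for returns the list of callbacks declared on the class for that event, and callback invokes each one according to its type: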
def callback(method, record)
callbacks_for(method).each do |callback|
case callback
when Symbol
owner.send(callback, record)
when Proc
callback.call(owner, record)
else
callback.send(method, owner, record)
end
end
end
def callbacks_for(callback_name)
full_callback_name = "#{callback_name}_for_#{reflection.name}"
owner.class.send(full_callback_name.to_sym) || []
end
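Adding the record in memory is simple: it is appended to @target. If the :uniq option is set, the method first looks for an equal element; if none exists the record is appended, otherwise the existing element is replaced in place (the id is the same, but the stored data may be stale).
Finally the newly built record is returned and build is done.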
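The flow of add_to_target, including the three callback shapes and the :uniq replacement, can be sketched as follows. SketchCollection is a hypothetical class written for this illustration; it stores plain values instead of records and plays the role of both the association and its owner.

```ruby
class SketchCollection
  attr_reader :target, :log

  def initialize(uniq: false, before_add: [], after_add: [])
    @uniq = uniq
    @callbacks = { before_add: before_add, after_add: after_add }
    @target = []
    @log = []
  end

  def add_to_target(record)
    run_callbacks(:before_add, record)
    index = @uniq ? @target.index(record) : nil
    if index
      @target[index] = record # replace the equal element in place
    else
      @target << record
    end
    run_callbacks(:after_add, record)
    record
  end

  # Target of a callback that is declared as a Symbol.
  def note(record)
    @log << "note:#{record}"
  end

  private

  # A Symbol names a method on the owner, a Proc is called directly, and any
  # other object is expected to respond to before_add/after_add itself.
  def run_callbacks(kind, record)
    @callbacks[kind].each do |callback|
      case callback
      when Symbol then send(callback, record)
      when Proc   then callback.call(self, record)
      else             callback.send(kind, self, record)
      end
    end
  end
end

collection = SketchCollection.new(uniq: true, before_add: [:note])
collection.add_to_target('a')
collection.add_to_target('a')
p collection.target # => ["a"]
```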
create is implemented along the same lines as build:
def create(attributes = {}, options = {}, &block)
create_record(attributes, options, &block)
end
def create_record(attributes, options, raise = false, &block)
unless owner.persisted?
raise ActiveRecord::RecordNotSaved, "You cannot call create unless the parent is saved"
end
if attributes.is_a?(Array)
attributes.collect { |attr| create_record(attr, options, raise, &block) }
else
transaction do
add_to_target(build_record(attributes, options)) do |record|
yield(record) if block_given?
insert_record(record, true, raise)
end
end
end
end
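create_record has the same structure as build. Apart from the owner.persisted? guard at the top, there are two differences: it wraps the code in a Transaction created by the transaction method, and the block passed to add_to_target calls insert_record.
The transaction method is analyzed later. Here we only look at insert_record, which is defined in ActiveRecord::Associations::HasManyAssociation: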
def insert_record(record, validate = true, raise = false)
set_owner_attributes(record)
if raise
record.save!(:validate => validate)
else
record.save(:validate => validate)
end
end
It calls set_owner_attributes to assign the owner attributes once more:
# Sets the owner attributes on the given record
def set_owner_attributes(record)
creation_attributes.each { |key, value| record[key] = value }
end
def creation_attributes
attributes = {}
if reflection.macro.in?([:has_one, :has_many]) && !options[:through]
attributes[reflection.foreign_key] = owner[reflection.active_record_primary_key]
if reflection.options[:as]
attributes[reflection.type] = owner.class.base_class.name
end
end
attributes
end
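It is not clear why the assignment is repeated here; on this code path it appears to be redundant. Then save or save! is called depending on the raise argument, and on success the newly created object is returned.
Now let us look at Scope Querying, starting with the declaration. An Active Record class can declare a scope in two ways: by writing the query conditions directly, or by wrapping them in a lambda. The lambda is evaluated every time the scope is called, whereas the direct form is evaluated only once, at declaration time. The direct form is therefore cheaper, but only the lambda form can take arguments at call time. Note that Rails 4 removed the direct form; the reason is explained here.
Since this article covers Rails 3.2, both forms are analyzed. First the direct form, with the sample code scope :of_employees, where(:imageable_type => 'Employee'). Calling where directly in the body of an ActiveRecord::Base subclass is delegated to scoped; the delegation is declared in ActiveRecord::Querying: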
delegate :select, :group, :order, :except, :reorder, :limit, :offset, :joins,
:where, :preload, :eager_load, :includes, :from, :lock, :readonly,
:having, :create_with, :uniq, :to => :scoped
So calling where here returns a Relation object based on scoped.
We then enter the scope method, which is defined in ActiveRecord::Scoping::Named:
def scope(name, scope_options = {})
name = name.to_sym
valid_scope_name?(name)
extension = Module.new(&Proc.new) if block_given?
scope_proc = lambda do |*args|
options = scope_options.respond_to?(:call) ? unscoped { scope_options.call(*args) } : scope_options
options = scoped.apply_finder_options(options) if options.is_a?(Hash)
relation = scoped.merge(options)
extension ? relation.extending(extension) : relation
end
singleton_class.send(:redefine_method, name, &scope_proc)
end
This method accepts both a Relation object and a Proc object. The name is checked by valid_scope_name?, which does not actually reject anything; its source is:
def valid_scope_name?(name)
if logger && respond_to?(name, true)
logger.warn "Creating scope :#{name}. " \
"Overwriting existing method #{self.name}.#{name}."
end
end
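It only logs a warning. Next, the lambda that carries out the scope is created and defined as a class method.
When the argument is a Relation object, it is merged directly with scoped. Note that if the class has a default_scope, the default_scope is merged with the Relation object only at this point, when the scope is called. This guarantees that declaring default_scope first does not bake its contents into every scope declared afterwards, which would cause subtle bugs.
Now the lambda case, with the sample code scope :of_products, -> { where(:imageable_type => 'Product') }. The method is still scope; the main difference is that the lambda is called inside unscoped with a block. We have met unscoped before but only in passing, and the block form was not covered, so here it is in detail: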
def unscoped
block_given? ? relation.scoping { yield } : relation
end
It first creates a fresh relation object (which is already an unscoped Relation) and then calls its scoping method:
def scoping
@klass.with_scope(self, :overwrite) { yield }
end
scoping takes a block, replaces the current scope with the relation itself (because :overwrite is passed), and then calls the block:
def with_scope(scope = {}, action = :merge, &block)
  # If another Active Record class has been passed in, get its current scope
  scope = scope.current_scope if !scope.is_a?(Relation) && scope.respond_to?(:current_scope)
  previous_scope = self.current_scope
  if scope.is_a?(Hash)
    # Dup first and second level of hash (method and params).
    scope = scope.dup
    scope.each do |method, params|
      scope[method] = params.dup unless params == true
    end
    scope.assert_valid_keys([ :find, :create ])
    relation = construct_finder_arel(scope[:find] || {})
    relation.default_scoped = true unless action == :overwrite
    if previous_scope && previous_scope.create_with_value && scope[:create]
      scope_for_create = if action == :merge
        previous_scope.create_with_value.merge(scope[:create])
      else
        scope[:create]
      end
      relation = relation.create_with(scope_for_create)
    else
      scope_for_create = scope[:create]
      scope_for_create ||= previous_scope.create_with_value if previous_scope
      relation = relation.create_with(scope_for_create) if scope_for_create
    end
    scope = relation
  end
  scope = previous_scope.merge(scope) if previous_scope && action == :merge
  self.current_scope = scope
  begin
    yield
  ensure
    self.current_scope = previous_scope
  end
end
The second argument of with_scope has only two options, :merge and :overwrite. :merge temporarily merges the two scopes, while :overwrite uses the passed-in scope alone.
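The save, replace and restore pattern at the heart of with_scope can be sketched with Hashes in place of Relation objects. SketchModel is a hypothetical class for this illustration, and Hash#merge stands in for Relation#merge.

```ruby
class SketchModel
  class << self
    attr_accessor :current_scope

    # Run the block with `scope` as the current scope, then restore the
    # previous one, even if the block raises.
    def with_scope(scope, action = :merge)
      previous = current_scope
      scope = previous.merge(scope) if previous && action == :merge
      self.current_scope = scope
      begin
        yield
      ensure
        self.current_scope = previous
      end
    end
  end
end

SketchModel.current_scope = { order: 'created_at desc' }

# :merge combines with the surrounding scope, :overwrite ignores it.
p SketchModel.with_scope({ where: 'x' }) { SketchModel.current_scope }
p SketchModel.with_scope({ where: 'x' }, :overwrite) { SketchModel.current_scope }
```

The ensure clause is what makes the change temporary: once the block finishes, code outside it sees the previous scope again.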
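So while the query inside the lambda runs, the current scope is the freshly created Relation object, which guarantees that the default_scope does not affect the lambda's result. The place where the merge happens is exactly the same as in the non-lambda case described above.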
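The difference in evaluation time between the two declaration forms can be shown with a small sketch. SketchScopes is hypothetical and returns plain values instead of Relation objects; only the respond_to?(:call) dispatch mirrors the scope method above.

```ruby
class SketchScopes
  # Define a class-level method `name`. `body` is either a ready value
  # (evaluated once, at declaration) or a callable (evaluated on every call).
  def self.scope(name, body)
    define_singleton_method(name) do |*args|
      body.respond_to?(:call) ? body.call(*args) : body
    end
  end

  @counter = 0

  def self.next_value
    @counter += 1
  end

  scope :eager,  next_value                  # evaluated now, exactly once
  scope :lazy,   -> { next_value }           # evaluated on each call
  scope :scaled, ->(factor) { 10 * factor }  # callables may take arguments
end

p SketchScopes.eager     # => 1
p SketchScopes.eager     # => 1
p SketchScopes.lazy      # => 2
p SketchScopes.lazy      # => 3
p SketchScopes.scaled(3) # => 30
```

With a real scope, the same effect means that a value such as the current time, if used in the direct form, is fixed when the class is loaded rather than when the query runs.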
Declaring and fetching default_scope is also simple. default_scope is declared in the ActiveRecord::Scoping::Default module:
def default_scope(scope = {})
scope = Proc.new if block_given?
self.default_scopes = default_scopes + [scope]
end
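default_scope accepts a Hash, a Relation object or a block. default_scopes is an array defined in the same module, so a class can hold several default_scope declarations.
The default scope is fetched in with_default_scope, introduced earlier. It is defined in ActiveRecord::Relation: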
def with_default_scope #:nodoc:
if default_scoped? && default_scope = klass.send(:build_default_scope)
default_scope = default_scope.merge(self)
default_scope.default_scoped = false
default_scope
else
self
end
end
If the current scope is marked as default scoped, build_default_scope is called to see whether a default scope was declared:
def build_default_scope #:nodoc:
  if method(:default_scope).owner != ActiveRecord::Scoping::Default::ClassMethods
    evaluate_default_scope { default_scope }
  elsif default_scopes.any?
    evaluate_default_scope do
      default_scopes.inject(relation) do |default_scope, scope|
        if scope.is_a?(Hash)
          default_scope.apply_finder_options(scope)
        elsif !scope.is_a?(Relation) && scope.respond_to?(:call)
          default_scope.merge(scope.call)
        else
          default_scope.merge(scope)
        end
      end
    end
  end
end
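If the owner of default_scope is not the Default module, the method has been overridden in the model and is called directly; otherwise, if default scopes have been declared, they are combined. Either way the work runs inside evaluate_default_scope: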
def evaluate_default_scope
return if ignore_default_scope?
begin
self.ignore_default_scope = true
yield
ensure
self.ignore_default_scope = false
end
end
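This method temporarily sets ignore_default_scope to true. As its first line shows, a nested call returns immediately while the flag is set, so its apparent purpose is to keep the default scope from being evaluated again while it is already being evaluated.
Then a fresh Relation object is created and each declared default scope is merged into it: a Hash goes through apply_finder_options, a block is called first and its result passed to merge, and a Relation object is passed to merge directly. The result is the final default scope.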
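The folding of several default scope declarations, together with the re-entrancy guard, can be sketched like this. SketchDefaults is a hypothetical class; Hashes replace Relation objects, and the private guard method plays the role of evaluate_default_scope.

```ruby
class SketchDefaults
  class << self
    def default_scopes
      @default_scopes ||= []
    end

    # Accepts a Hash or a block, like the declaration described above.
    def default_scope(scope = {}, &block)
      scope = block if block
      default_scopes << scope
    end

    # Fold every declared default scope into one Hash, calling callables first.
    def build_default_scope
      return nil if default_scopes.empty?
      guard do
        default_scopes.inject({}) do |result, scope|
          result.merge(scope.respond_to?(:call) ? scope.call : scope)
        end
      end
    end

    private

    # Re-entrancy guard: returns nil if a default scope is already being built.
    def guard
      return nil if @building
      begin
        @building = true
        yield
      ensure
        @building = false
      end
    end
  end
end

pictures = Class.new(SketchDefaults)
pictures.default_scope({ order: 'created_at desc' })
pictures.default_scope { { limit: 10 } }
p pictures.build_default_scope
```

A block that itself asks for the default scope gets nil from the nested call instead of recursing forever, which is the behaviour the guard exists to provide in this sketch.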
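From this part on we leave the basic query and association features. This part briefly analyzes attribute initialization, reading and writing on Active Record objects, as well as STI and serialization. The example is: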
class User < ActiveRecord::Base
attr_accessible :contact, :type, :username
serialize :contact
end
class Student < User; end
s.username = 'bachue'
p s.username
s.contact = {:phone => '123456', :city => 'Shanghai', :address => 'NanJing RD'}
p s.contact
Let us start with the serialize declaration. In Active Record, serialization-related code generally lives in the ActiveRecord::AttributeMethods::Serialization module:
def serialize(attr_name, class_name = Object)
coder = if [:load, :dump].all? { |x| class_name.respond_to?(x) }
class_name
else
Coders::YAMLColumn.new(class_name)
end
# merge new serialized attribute and create new hash to ensure that each class in inheritance hierarchy
# has its own hash of own serialized attributes
self.serialized_attributes = serialized_attributes.merge(attr_name.to_s => coder)
end
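The second argument of serialize specifies how to serialize; it accepts a class that implements :load and :dump. If the class passed does not meet this requirement, or no argument is given, an instance of Coders::YAMLColumn is used, which serializes the data with the YAML library. Finally, the attribute name and the coder are stored in serialized_attributes.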
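The coder selection can be sketched in plain Ruby. SketchYAMLCoder, SketchJSONCoder and pick_coder are hypothetical names for this illustration; the default coder only loosely mirrors the role of Coders::YAMLColumn, and string keys are used so that YAML.safe_load accepts the data.

```ruby
require 'json'
require 'yaml'

# Hypothetical default coder wrapping YAML.
class SketchYAMLCoder
  def dump(value)
    YAML.dump(value)
  end

  def load(text)
    text.nil? ? nil : YAML.safe_load(text)
  end
end

# A custom coder only needs to respond to load and dump.
module SketchJSONCoder
  def self.dump(value)
    JSON.generate(value)
  end

  def self.load(text)
    JSON.parse(text)
  end
end

# Use the argument itself if it responds to both :load and :dump,
# otherwise fall back to the default coder.
def pick_coder(candidate = Object)
  if [:load, :dump].all? { |m| candidate.respond_to?(m) }
    candidate
  else
    SketchYAMLCoder.new
  end
end

contact = { 'phone' => '123456', 'city' => 'Shanghai' }
[pick_coder, pick_coder(SketchJSONCoder)].each do |coder|
  stored = coder.dump(contact)         # what would go into the column
  puts coder.load(stored) == contact   # round trip gives back the same Hash
end
```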
Next, how Active Record initializes attributes. There are several possible entry points: respond_to?, read_attribute, write_attribute, or method_missing. In every case the work ends up in define_attribute_methods, defined in ActiveRecord::AttributeMethods:
def define_attribute_methods
unless defined?(@attribute_methods_mutex)
msg = "It looks like something (probably a gem/plugin) is overriding the " \
"ActiveRecord::Base.inherited method. It is important that this hook executes so " \
"that your models are set up correctly. A workaround has been added to stop this " \
"causing an error in 3.2, but future versions will simply not work if the hook is " \
"overridden. If you are using Kaminari, please upgrade as it is known to have had " \
"this problem.\n\n"
msg << "The following may help track down the problem:"
meth = method(:inherited)
if meth.respond_to?(:source_location)
msg << " #{meth.source_location.inspect}"
else
msg << " #{meth.inspect}"
end
msg << "\n\n"
ActiveSupport::Deprecation.warn(msg)
@attribute_methods_mutex = Mutex.new
end
# Use a mutex; we don't want two thread simaltaneously trying to define
# attribute methods.
@attribute_methods_mutex.synchronize do
return if attribute_methods_generated?
superclass.define_attribute_methods unless self == base_class
super(column_names)
column_names.each { |name| define_external_attribute_method(name) }
@attribute_methods_generated = true
end
end
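Defining attribute methods is done under a lock to avoid thread-safety problems. If the current class is not the base class, the superclass's method of the same name is called first to define its attribute methods. Then the method of the same name in ActiveModel::AttributeMethods is called with all column names as the argument: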
def define_attribute_methods(attr_names)
attr_names.each { |attr_name| define_attribute_method(attr_name) }
end
define_attribute_method is called for each attribute name:
def define_attribute_method(attr_name)
attribute_method_matchers.each do |matcher|
method_name = matcher.method_name(attr_name)
unless instance_method_already_implemented?(method_name)
generate_method = "define_method_#{matcher.method_missing_target}"
if respond_to?(generate_method, true)
send(generate_method, attr_name)
else
define_optimized_call generated_attribute_methods, method_name, matcher.method_missing_target, attr_name.to_s
end
end
end
attribute_method_matchers_cache.clear
end
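This iterates over the attribute_method_matchers array. Each call to attribute_method_prefix, attribute_method_suffix or attribute_method_affix in a model adds one element to it. An element holds a regular expression and a format string, so it can map a method name to an attribute and an attribute to a method name. The matcher's method_name is called to get the method name, then instance_method_already_implemented? checks whether that method already exists. ActiveRecord::AttributeMethods overrides this check: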
def instance_method_already_implemented?(method_name)
if dangerous_attribute_method?(method_name)
raise DangerousAttributeError, "#{method_name} is defined by ActiveRecord"
end
if superclass == Base
super
else
# If B < A and A defines its own attribute method, then we don't want to overwrite that.
defined = method_defined_within?(method_name, superclass, superclass.generated_attribute_methods)
defined && !ActiveRecord::Base.method_defined?(method_name) || super
end
end
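It first checks whether the attribute is id or clashes with a method of ActiveRecord::Base (clashing with a method of Object is allowed). Then, if the model's superclass is ActiveRecord::Base, it calls the parent implementation: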
def instance_method_already_implemented?(method_name)
generated_attribute_methods.method_defined?(method_name)
end
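The parent implementation just checks whether the method is defined in the generated_attribute_methods module.
Back in define_attribute_method: if the method is not yet defined, the name of a generator method is built by convention, and the code checks whether the class responds to it. If it does, that generator is called; if not, define_optimized_call generates the method body: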
# Define a method `name` in `mod` that dispatches to `send`
# using the given `extra` args. This fallbacks `define_method`
# and `send` if the given names cannot be compiled.
def define_optimized_call(mod, name, send, *extra)
if name =~ NAME_COMPILABLE_REGEXP
defn = "def #{name}(*args)"
else
defn = "define_method(:'#{name}') do |*args|"
end
extra = (extra.map(&:inspect) << "*args").join(", ")
if send =~ CALL_COMPILABLE_REGEXP
target = "#{send}(#{extra})"
else
target = "send(:'#{send}', #{extra})"
end
mod.module_eval <<-RUBY, __FILE__, __LINE__ + 1
#{defn}
#{target}
end
RUBY
end
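By convention, the generated method calls a substitute method whose name is the original method name with the attribute part replaced by attribute, and passes the attribute name as an argument. For example, the body of name= is a call to attribute=('name', *args). If the generator method does exist, it is called instead. This is where define_method_attribute, which defines readers, and define_method_attribute=, which defines writers, get called; both are explained in detail shortly.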
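Finally attribute_method_matchers_cache is cleared. This cache stores the result of looking up a matcher in attribute_method_matchers by method name with the regular expressions, and every update to the attribute methods invalidates it.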
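The convention of generating per-attribute methods that dispatch to a generic handler can be sketched as follows. SketchRecord and its simplified define_attribute_methods are hypothetical; only the shape of the generated source (a method taking *args that forwards to attribute or attribute= with the attribute name prepended) follows define_optimized_call.

```ruby
class SketchRecord
  def initialize
    @attributes = {}
  end

  # Generic handlers that every generated method dispatches to.
  def attribute(name)
    @attributes[name]
  end

  def attribute=(name, value)
    @attributes[name] = value
  end

  # For each attribute, generate `name` and `name=` from source text.
  def self.define_attribute_methods(*names)
    names.each do |name|
      { name.to_s => 'attribute', "#{name}=" => 'attribute=' }.each do |method_name, target|
        class_eval <<-RUBY, __FILE__, __LINE__ + 1
          def #{method_name}(*args)
            send(:'#{target}', #{name.to_s.inspect}, *args)
          end
        RUBY
      end
    end
  end

  define_attribute_methods :username, :email
end

record = SketchRecord.new
record.username = 'bachue'
p record.username # => "bachue"
```

Compiling the methods from a string avoids the closure that define_method would create, at the cost of requiring names that are valid in def syntax.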
Then define_external_attribute_method is called for each attribute name:
def define_external_attribute_method(attr_name)
generated_external_attribute_methods.module_eval <<-STR, __FILE__, __LINE__ + 1
def __temp__(v, attributes, attributes_cache, attr_name)
#{external_attribute_access_code(attr_name, attribute_cast_code(attr_name))}
end
alias_method '#{attr_name}', :__temp__
undef_method :__temp__
STR
end
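The way the method is defined looks odd; the reason is given in a comment in the source. define_method has to create a closure, which can be slower and use more memory, while plain def cannot create methods whose names are not valid Ruby identifiers. Defining __temp__ first and then aliasing it solves both problems. The method adds to generated_external_attribute_methods a method named after the attribute, whose body is the result of external_attribute_access_code(attr_name, attribute_cast_code(attr_name)). attribute_cast_code has several layers; the first is in ActiveRecord::AttributeMethods::Serialization: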
def attribute_cast_code(attr_name)
if serialized_attributes.include?(attr_name)
"v.unserialized_value"
else
super
end
end
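If the attribute is serialized, its unserialized_value method is called (it deserializes the data and is explained shortly). Otherwise the parent implementation is called, which is defined in ActiveRecord::AttributeMethods::TimeZoneConversion: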
# The enhanced read method automatically converts the UTC time stored in the database to the time
# zone stored in Time.zone.
def attribute_cast_code(attr_name)
column = columns_hash[attr_name]
if create_time_zone_conversion_attribute?(attr_name, column)
typecast = "v = #{super}"
time_zone_conversion = "v.acts_like?(:time) ? v.in_time_zone : v"
"((#{typecast}) && (#{time_zone_conversion}))"
else
super
end
end
create_time_zone_conversion_attribute? is implemented as:
def create_time_zone_conversion_attribute?(name, column)
time_zone_aware_attributes && !self.skip_time_zone_conversion_for_attributes.include?(name.to_sym) && column.type.in?([:datetime, :timestamp])
end
Rails keeps track of a current time zone. If the column is a time type, the time read from the database is converted to that zone using in_time_zone, defined by ActiveSupport::TimeWithZone.
Otherwise the next layer up is called, defined in ActiveRecord::AttributeMethods::Read:
def attribute_cast_code(attr_name)
columns_hash[attr_name].type_cast_code('v')
end
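Data fetched through a database adapter usually arrives as strings, and Rails converts it to the matching Ruby type according to the column type. That is the job of type_cast_code, implemented by ActiveRecord::ConnectionAdapters::Column: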
def type_cast_code(var_name)
klass = self.class.name
case type
when :string, :text then var_name
when :integer then "#{klass}.value_to_integer(#{var_name})"
when :float then "#{var_name}.to_f"
when :decimal then "#{klass}.value_to_decimal(#{var_name})"
when :datetime, :timestamp then "#{klass}.string_to_time(#{var_name})"
when :time then "#{klass}.string_to_dummy_time(#{var_name})"
when :date then "#{klass}.string_to_date(#{var_name})"
when :binary then "#{klass}.binary_to_string(#{var_name})"
when :boolean then "#{klass}.value_to_boolean(#{var_name})"
else var_name
end
end
Next comes external_attribute_access_code:
def external_attribute_access_code(attr_name, cast_code)
access_code = "v && #{cast_code}"
if cache_attribute?(attr_name)
access_code = "attributes_cache[attr_name] ||= (#{access_code})"
end
access_code
end
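There is also caching here, a feature defined by ActiveRecord::AttributeMethods::Read itself: when the column type is one of [:datetime, :timestamp, :time, :date], the value is cached. The generated string is then returned as code.
Finally, back in define_attribute_methods, @attribute_methods_generated is set to true and all attribute methods have been generated.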
Next we look at define_method_attribute and define_method_attribute=, in preparation for the analysis of attribute reads and writes. define_method_attribute again has several layers; the first is in ActiveRecord::AttributeMethods::PrimaryKey:
def define_method_attribute(attr_name)
super
if attr_name == primary_key && attr_name != 'id'
generated_attribute_methods.send(:alias_method, :id, primary_key)
generated_external_attribute_methods.module_eval <<-CODE, __FILE__, __LINE__
def id(v, attributes, attributes_cache, attr_name)
attr_name = '#{primary_key}'
send(attr_name, attributes[attr_name], attributes, attributes_cache, attr_name)
end
CODE
end
end
When the primary key is not named id, this layer still creates an id method and aliases it to the primary key's method. Then the next layer, in ActiveRecord::AttributeMethods::Read:
def define_method_attribute(attr_name)
generated_attribute_methods.module_eval <<-STR, __FILE__, __LINE__ + 1
def __temp__
#{internal_attribute_access_code(attr_name, attribute_cast_code(attr_name))}
end
alias_method '#{attr_name}', :__temp__
undef_method :__temp__
STR
end
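This is very similar to define_external_attribute_method. The differences are that the latter defines its method on the generated_external_attribute_methods module, its __temp__ takes its data as arguments instead of relying on instance variables, and its code is generated by external_attribute_access_code.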
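The __temp__ and alias_method pattern shared by both generators can be tried in isolation. SketchGenerated and SketchRow are hypothetical; the attribute name 'first name' is chosen only because it cannot be written after def.

```ruby
module SketchGenerated
  # `def` cannot spell a name such as "first name", but alias_method can:
  # compile the body under a temporary name, alias it, then drop the temp name.
  def self.define_reader(attr_name)
    module_eval <<-RUBY, __FILE__, __LINE__ + 1
      def __temp__
        @attributes[#{attr_name.inspect}]
      end
      alias_method #{attr_name.inspect}, :__temp__
      undef_method :__temp__
    RUBY
  end
end

class SketchRow
  include SketchGenerated

  def initialize(attributes)
    @attributes = attributes
  end
end

SketchGenerated.define_reader('first name')

row = SketchRow.new({ 'first name' => 'Ada' })
p row.public_send('first name') # => "Ada"
p row.respond_to?(:__temp__)    # => false
```

The aliased method keeps working after __temp__ is undefined, because alias_method copies the method body under the new name.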
Here is how internal_attribute_access_code generates code:
def internal_attribute_access_code(attr_name, cast_code)
access_code = "(v=@attributes[attr_name]) && #{cast_code}"
unless attr_name == primary_key
access_code.insert(0, "missing_attribute(attr_name, caller) unless @attributes.has_key?(attr_name); ")
end
if cache_attribute?(attr_name)
access_code = "@attributes_cache[attr_name] ||= (#{access_code})"
end
"attr_name = '#{attr_name}'; #{access_code}"
end
The generated code is simple and similar to external_attribute_access_code, except that it reads from @attributes and then type-casts. If the attribute cannot be found, missing_attribute is called, which raises ActiveModel::MissingAttributeError.
Next, define_method_attribute=, which also has several layers. The first is defined in ActiveRecord::AttributeMethods::TimeZoneConversion:
# Defined for all +datetime+ and +timestamp+ attributes when +time_zone_aware_attributes+ are enabled.
# This enhanced write method will automatically convert the time passed to it to the zone stored in Time.zone.
def define_method_attribute=(attr_name)
if create_time_zone_conversion_attribute?(attr_name, columns_hash[attr_name])
method_body, line = <<-EOV, __LINE__ + 1
def #{attr_name}=(original_time)
original_time = nil if original_time.blank?
time = original_time
unless time.acts_like?(:time)
time = time.is_a?(String) ? Time.zone.parse(time) : time.to_time rescue time
end
time = time.in_time_zone rescue nil if time
previous_time = attribute_changed?("#{attr_name}") ? changed_attributes["#{attr_name}"] : read_attribute(:#{attr_name})
write_attribute(:#{attr_name}, original_time)
#{attr_name}_will_change! if previous_time != time
@attributes_cache["#{attr_name}"] = time
end
EOV
generated_attribute_methods.module_eval(method_body, __FILE__, line)
else
super
end
end
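As mentioned before, this method takes the time passed in, converts it to the time zone configured in Rails, writes the original value with write_attribute, and explicitly marks the attribute as changed when the converted value differs from the previous one (the reason is given here). read_attribute and write_attribute are analyzed shortly. The next layer is in ActiveRecord::AttributeMethods::Write: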
def define_method_attribute=(attr_name)
if attr_name =~ ActiveModel::AttributeMethods::NAME_COMPILABLE_REGEXP
generated_attribute_methods.module_eval("def #{attr_name}=(new_value); write_attribute('#{attr_name}', new_value); end", __FILE__, __LINE__)
else
generated_attribute_methods.send(:define_method, "#{attr_name}=") do |new_value|
write_attribute(attr_name, new_value)
end
end
end
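The method is again defined on generated_attribute_methods, and its body just calls write_attribute to write the new value.
This reveals a detail: reading an attribute from an Active Record object does not need read_attribute; the generated reader takes the value from @attributes and casts it. Writing, however, always goes through write_attribute. The only extra benefit of read_attribute is that it defines the attribute methods if they have not been defined yet. Here is its source: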
# Returns the value of the attribute identified by <tt>attr_name</tt> after it has been typecast (for example,
# "2004-12-12" in a data column is cast to a date object, like Date.new(2004, 12, 12)).
def read_attribute(attr_name)
self.class.type_cast_attribute(attr_name, @attributes, @attributes_cache)
end
def type_cast_attribute(attr_name, attributes, cache = {})
return unless attr_name
attr_name = attr_name.to_s
if generated_external_attribute_methods.method_defined?(attr_name)
if attributes.has_key?(attr_name) || attr_name == 'id'
generated_external_attribute_methods.send(attr_name, attributes[attr_name], attributes, cache, attr_name)
end
elsif !attribute_methods_generated?
# If we haven't generated the caster methods yet, do that and
# then try again
define_attribute_methods
type_cast_attribute(attr_name, attributes, cache)
else
# If we get here, the attribute has no associated DB column, so
# just return it verbatim.
attributes[attr_name]
end
end
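read_attribute relies on the methods in generated_external_attribute_methods to do the read and passes in its own @attributes and @attributes_cache, probably just to share code.
Before analyzing write_attribute, let us take a look at STI. STI-related code is spread throughout Active Record and cannot really be analyzed in one place, and much of it has come up before, so here we focus on initialization and assignment.
The initialize method of ActiveRecord::Base calls a method named ensure_proper_type: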
def initialize(attributes = nil, options = {})
defaults = Hash[self.class.column_defaults.map { |k, v| [k, v.duplicable? ? v.dup : v] }]
@attributes = self.class.initialize_attributes(defaults)
@association_cache = {}
@aggregation_cache = {}
@attributes_cache = {}
@new_record = true
@readonly = false
@destroyed = false
@marked_for_destruction = false
@previously_changed = {}
@changed_attributes = {}
ensure_proper_type
populate_with_current_scope_attributes
assign_attributes(attributes, options) if attributes
yield self if block_given?
run_callbacks :initialize
end
It is implemented in ActiveRecord::Inheritance:
# Sets the attribute used for single table inheritance to this class name if this is not the
# ActiveRecord::Base descendant.
# Considering the hierarchy Reply < Message < ActiveRecord::Base, this makes it possible to
# do Reply.new without having to set <tt>Reply[Reply.inheritance_column] = "Reply"</tt> yourself.
# No such attribute would be set for objects of the Message class in that example.
def ensure_proper_type
klass = self.class
if klass.finder_needs_type_condition?
write_attribute(klass.inheritance_column, klass.sti_name)
end
end
The source of klass.finder_needs_type_condition? was covered earlier; it simply checks whether the columns include inheritance_column. Then write_attribute writes klass.sti_name into that column. sti_name is implemented as:
def sti_name
store_full_sti_class ? name : name.demodulize
end
store_full_sti_class is true by default, so the full class name is written to inheritance_column. We now analyze write_attribute, which also has several layers. The first is in ActiveRecord::AttributeMethods::Dirty:
# Wrap write_attribute to remember original attribute value.
def write_attribute(attr, value)
attr = attr.to_s
# The attribute already has an unsaved change.
if attribute_changed?(attr)
old = @changed_attributes[attr]
@changed_attributes.delete(attr) unless _field_changed?(attr, old, value)
else
old = clone_attribute_value(:read_attribute, attr)
# Save Time objects as TimeWithZone if time_zone_aware_attributes == true
old = old.in_time_zone if clone_with_time_zone_conversion_attribute?(attr, old)
@changed_attributes[attr] = old if _field_changed?(attr, old, value)
end
# Carry on.
super(attr, value)
end
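If the attribute has already been changed and the new value equals the value it had before the change, as if the change were reverted, write_attribute removes the entry from @changed_attributes. If it has not been changed before, clone_attribute_value uses read_attribute to fetch a copy of the current value: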
def clone_attribute_value(reader_method, attribute_name)
value = send(reader_method, attribute_name)
value.duplicable? ? value.clone : value
rescue TypeError, NoMethodError
value
end
Then clone_with_time_zone_conversion_attribute? checks whether a time zone conversion is needed:
def clone_with_time_zone_conversion_attribute?(attr, old)
old.class.name == "Time" && time_zone_aware_attributes && !self.skip_time_zone_conversion_for_attributes.include?(attr.to_sym)
end
The old value is then stored in @changed_attributes if the field really changed, and control passes to the next layer, defined in ActiveRecord::AttributeMethods::Write:
# Updates the attribute identified by <tt>attr_name</tt> with the specified +value+. Empty strings
# for fixnum and float columns are turned into +nil+.
def write_attribute(attr_name, value)
attr_name = attr_name.to_s
attr_name = self.class.primary_key if attr_name == 'id' && self.class.primary_key
@attributes_cache.delete(attr_name)
column = column_for_attribute(attr_name)
unless column || @attributes.has_key?(attr_name)
ActiveSupport::Deprecation.warn(
"You're trying to create an attribute `#{attr_name}'. Writing arbitrary " \
"attributes on a model is deprecated. Please just use `attr_writer` etc."
)
end
@attributes[attr_name] = type_cast_attribute_for_write(column, value)
end
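If the object's primary key is not id, assigning to id is still equivalent to assigning to the primary key. The assignment itself stores the value converted by type_cast_attribute_for_write into @attributes. That method is implemented as: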
def type_cast_attribute_for_write(column, value)
if column && coder = self.class.serialized_attributes[column.name]
Attribute.new(coder, value, :unserialized)
else
super
end
end
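This layer is the implementation in ActiveRecord::AttributeMethods::Serialization. If the column is serialized, an Attribute instance is created and assigned instead. The class maintains three fields: the coder, the value, and the current serialization state: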
class Attribute < Struct.new(:coder, :value, :state)
def unserialized_value
state == :serialized ? unserialize : value
end
def serialized_value
state == :unserialized ? serialize : value
end
def unserialize
self.state = :unserialized
self.value = coder.load(value)
end
def serialize
self.state = :serialized
self.value = coder.dump(value)
end
end
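To see how the struct behaves, here is a small self-contained sketch. The struct is copied from the listing above so that the example runs on its own; JsonCoder is a hypothetical coder invented for the illustration, since any object responding to load and dump will do.

```ruby
require "json"

# Same shape as the struct shown above, repeated so the sketch runs alone.
class Attribute < Struct.new(:coder, :value, :state)
  def unserialized_value
    state == :serialized ? unserialize : value
  end

  def serialized_value
    state == :unserialized ? serialize : value
  end

  def unserialize
    self.state = :unserialized
    self.value = coder.load(value)
  end

  def serialize
    self.state = :serialized
    self.value = coder.dump(value)
  end
end

# Hypothetical coder used only for this illustration.
module JsonCoder
  def self.load(string)
    JSON.parse(string)
  end

  def self.dump(object)
    JSON.generate(object)
  end
end

attribute = Attribute.new(JsonCoder, { "theme" => "dark" }, :unserialized)
attribute.unserialized_value               # already a Hash, the coder is not called
column_value = attribute.serialized_value  # dumps the value and flips the state
```

Note that each conversion replaces the stored value and flips the state, so asking for the same form twice in a row does not run the coder again.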
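The class is lazy: the coder only runs when the value is requested in a form other than the one it is currently held in, which keeps it both flexible and cheap. If the column is not serialized, execution continues with the parent implementation in ActiveRecord::AttributeMethods::Write: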
def type_cast_attribute_for_write(column, value)
if column && column.number?
convert_number_column_value(value)
else
value
end
end
Here column.number? tells whether the column has any numeric type, such as integer, float, or decimal. If it does, the value is converted by convert_number_column_value:
def convert_number_column_value(value)
case value
when FalseClass
0
when TrueClass
1
when String
value.presence
else
value
end
end
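The effect of this conversion is easiest to see with a few inputs. The following is a plain-Ruby approximation: String#presence comes from Active Support, so the sketch emulates it with an explicit blank check, and the method name number_column_value is made up for the example.

```ruby
# Plain-Ruby approximation of the conversion above (no Active Support needed).
def number_column_value(value)
  case value
  when false then 0
  when true then 1
  when String
    # Emulates String#presence: blank strings become nil.
    value.strip.empty? ? nil : value
  else
    value
  end
end

number_column_value(true)   # booleans are mapped to 1 and 0
number_column_value("")     # a blank string becomes nil instead of 0
number_column_value("42")   # other strings are left for the column type cast
```

This is what the comment on write_attribute means by empty strings being turned into nil for numeric columns.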
This is the last use case of the chapter. It is a fairly rich one and lets us walk through several smaller features of Active Record.
class User < ActiveRecord::Base
attr_accessible :email, :location, :login, :zip
validates :login, :email, presence: true
validates_format_of :email, :with => /\A([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})\z/i
before_validation :ensure_login_has_a_value
before_save :set_location, if: 'zip.present?'
protected
def ensure_login_has_a_value
if login.nil?
self.login = email unless email.blank?
end
end
def set_location
self.location = LocationService.query self
end
end
User.transaction do
user.save
end
Let us start with the implementations of attr_accessible and attr_protected. Both methods are defined in ActiveModel::MassAssignmentSecurity, in activemodel-3.2.13/lib/active_model/mass_assignment_security.rb.
def attr_accessible(*args)
options = args.extract_options!
role = options[:as] || :default
self._accessible_attributes = accessible_attributes_configs.dup
Array.wrap(role).each do |name|
self._accessible_attributes[name] = self.accessible_attributes(name) + args
end
self._active_authorizer = self._accessible_attributes
end
In ActiveModel, _accessible_attributes holds the whitelist rules, as the implementation of accessible_attributes_configs shows:
def accessible_attributes_configs
self._accessible_attributes ||= begin
Hash.new { |h,k| h[k] = WhiteList.new }
end
end
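Each model class can hold several whitelists, one per role, managed by a Hash; by default :default is used as the key. Once _accessible_attributes has been updated, it is assigned to _active_authorizer, which makes the whitelists the rules in effect.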
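The role-keyed Hash with a default block can be sketched as follows. RoleWhiteList is a hypothetical stand-in for the WhiteList class, reduced to a Set with a deny? method; the attribute and role names are made up.

```ruby
require "set"

# Hypothetical, reduced stand-in for ActiveModel's WhiteList.
class RoleWhiteList < Set
  # A key is denied unless it is on the list.
  def deny?(key)
    !include?(key.to_s)
  end
end

# One whitelist per role, created lazily on first access.
configs = Hash.new { |hash, role| hash[role] = RoleWhiteList.new }

configs[:default].merge(%w[email login])
configs[:admin].merge(%w[email login role])

configs[:default].deny?("role")  # true, "role" is not on the default list
configs[:admin].deny?(:role)     # false, admins may assign it
```

Because of the default block, asking for a role that was never configured yields an empty whitelist, which denies every attribute.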
Now let us look at the implementation of attr_protected:
def attr_protected(*args)
options = args.extract_options!
role = options[:as] || :default
self._protected_attributes = protected_attributes_configs.dup
Array.wrap(role).each do |name|
self._protected_attributes[name] = self.protected_attributes(name) + args
end
self._active_authorizer = self._protected_attributes
end
The implementation of attr_protected is very similar to that of attr_accessible, except that it uses blacklist rules, as protected_attributes_configs shows:
def protected_attributes_configs
self._protected_attributes ||= begin
Hash.new { |h,k| h[k] = BlackList.new(attributes_protected_by_default) }
end
end
The blacklist can contain some attributes by default. In ActiveModel the default list is empty, but ActiveRecord defines its own rule:
# The primary key and inheritance column can never be set by mass-assignment for security reasons.
def attributes_protected_by_default
default = [ primary_key, inheritance_column ]
default << 'id' unless primary_key.eql? 'id'
default
end
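As we can see, the primary key and the STI inheritance column are protected by default.
Finally, _protected_attributes is likewise assigned to _active_authorizer. It follows that attr_protected and attr_accessible cannot be combined on the same model: whichever is called last supplies the active authorizer.
When an ActiveRecord object is initialized with attributes, assign_attributes is called to do the assignment: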
def assign_attributes(new_attributes, options = {})
return if new_attributes.blank?
attributes = new_attributes.stringify_keys
multi_parameter_attributes = []
nested_parameter_attributes = []
@mass_assignment_options = options
unless options[:without_protection]
attributes = sanitize_for_mass_assignment(attributes, mass_assignment_role)
end
attributes.each do |k, v|
if k.include?("(")
multi_parameter_attributes << [ k, v ]
elsif respond_to?("#{k}=")
if v.is_a?(Hash)
nested_parameter_attributes << [ k, v ]
else
send("#{k}=", v)
end
else
raise(UnknownAttributeError, "unknown attribute: #{k}")
end
end
# assign any deferred nested attributes after the base attributes have been set
nested_parameter_attributes.each do |k,v|
send("#{k}=", v)
end
@mass_assignment_options = nil
assign_multiparameter_attributes(multi_parameter_attributes)
end
The method responsible for filtering the attributes is sanitize_for_mass_assignment:
def sanitize_for_mass_assignment(attributes, role = nil)
_mass_assignment_sanitizer.sanitize(attributes, mass_assignment_authorizer(role))
end
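_mass_assignment_sanitizer can be one of two objects: an ActiveModel::MassAssignmentSecurity::StrictSanitizer or an ActiveModel::MassAssignmentSecurity::LoggerSanitizer. With the configuration of a generated application, the former is used in the development and test environments and the latter in production.
mass_assignment_authorizer fetches the whitelist or blacklist object for the given role from active_authorizer, which was set up earlier: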
def mass_assignment_authorizer(role)
self.class.active_authorizer[role || :default]
end
Then we enter the sanitize method:
# Returns all attributes not denied by the authorizer.
def sanitize(attributes, authorizer)
sanitized_attributes = attributes.reject { |key, value| authorizer.deny?(key) }
debug_protected_attribute_removal(attributes, sanitized_attributes)
sanitized_attributes
end
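It first asks the configured authorizer about every key passed in: with a whitelist, every key not on the list is removed; with a blacklist, every key on the list is removed. Then debug_protected_attribute_removal deals with the keys that were dropped: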
def debug_protected_attribute_removal(attributes, sanitized_attributes)
removed_keys = attributes.keys - sanitized_attributes.keys
process_removed_attributes(removed_keys) if removed_keys.any?
end
The implementation of process_removed_attributes is where StrictSanitizer and LoggerSanitizer differ: the former raises ActiveModel::MassAssignmentSecurity::Error, while the latter writes a log entry.
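The division of labour between the authorizer and the two sanitizers can be sketched as below. All class names here (SketchBlackList, SketchSanitizer, LoggingSanitizer, RaisingSanitizer) are hypothetical, and the logging variant collects messages in an array instead of writing to a real logger.

```ruby
require "set"

# Hypothetical blacklist: a key is denied when it is on the list.
class SketchBlackList < Set
  def deny?(key)
    include?(key.to_s)
  end
end

# Shared filtering logic; subclasses decide what to do with removed keys.
class SketchSanitizer
  def sanitize(attributes, authorizer)
    kept = attributes.reject { |key, _value| authorizer.deny?(key) }
    removed = attributes.keys - kept.keys
    process_removed_attributes(removed) if removed.any?
    kept
  end
end

# Mirrors the logging behaviour: remember a message and carry on.
class LoggingSanitizer < SketchSanitizer
  attr_reader :messages

  def initialize
    @messages = []
  end

  def process_removed_attributes(keys)
    @messages << "Can't mass-assign protected attributes: #{keys.join(', ')}"
  end
end

# Mirrors the strict behaviour: refuse the whole assignment.
class RaisingSanitizer < SketchSanitizer
  class Error < StandardError; end

  def process_removed_attributes(keys)
    raise Error, "Can't mass-assign protected attributes: #{keys.join(', ')}"
  end
end

blacklist = SketchBlackList.new(%w[id admin])
params = { "login" => "alice", "admin" => true }

logging = LoggingSanitizer.new
logging.sanitize(params, blacklist)  # keeps "login" and records a message
```

With RaisingSanitizer the same call raises instead, which is why the strict variant is handy while developing: a forgotten attr_accessible entry fails loudly rather than being silently dropped.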
Next, let us look at how validations are declared. There are in fact several ways to declare one; we start with the simplest, validates :login, :email, presence: true. The method is defined in the ActiveModel::Validations module:
def validates(*attributes)
defaults = attributes.extract_options!
validations = defaults.slice!(*_validates_default_keys)
raise ArgumentError, "You need to supply at least one attribute" if attributes.empty?
raise ArgumentError, "You need to supply at least one validation" if validations.empty?
defaults.merge!(:attributes => attributes)
validations.each do |key, options|
key = "#{key.to_s.camelize}Validator"
begin
validator = key.include?('::') ? key.constantize : const_get(key)
rescue NameError
raise ArgumentError, "Unknown validator: '#{key}'"
end
validates_with(validator, defaults.merge(_parse_validates_options(options)))
end
end
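First, the method splits the trailing options Hash in two. The keys listed in _validates_default_keys, which here is [:if, :unless, :on, :allow_blank, :allow_nil, :strict], stay in defaults as options shared by all validators; the remaining entries name the validators to use together with their own options. The attributes are then merged into the options passed to validates_with. The _parse_validates_options method wraps non-Hash option values in a Hash so that they can be merged with the shared options: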
def _parse_validates_options(options)
case options
when TrueClass
{}
when Hash
options
when Range, Array
{ :in => options }
else
{ :with => options }
end
end
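How a validates call is taken apart can be sketched without Rails. The helper names below (expand_validates, simple_camelize, parse_option) are hypothetical, camelize is reduced to the underscore case, and instead of instantiating validators the sketch just returns the class name and the merged options for each one.

```ruby
# Keys that are shared by every validator in a single validates call.
DEFAULT_KEYS = [:if, :unless, :on, :allow_blank, :allow_nil, :strict]

# Simplified camelize: "presence" -> "Presence", "my_rule" -> "MyRule".
def simple_camelize(name)
  name.to_s.split("_").map(&:capitalize).join
end

# Wraps shorthand option values in a Hash, as _parse_validates_options does.
def parse_option(value)
  case value
  when true then {}
  when Hash then value
  when Range, Array then { :in => value }
  else { :with => value }
  end
end

def expand_validates(*attributes)
  options = attributes.last.is_a?(Hash) ? attributes.pop : {}
  defaults, validations =
    options.partition { |key, _| DEFAULT_KEYS.include?(key) }.map(&:to_h)
  defaults[:attributes] = attributes

  validations.map do |key, value|
    ["#{simple_camelize(key)}Validator", defaults.merge(parse_option(value))]
  end
end

expand_validates(:login, :email, :presence => true, :on => :create)
expand_validates(:age, :inclusion => 18..99)
```

The first call yields a single PresenceValidator entry whose options contain both the shared :on key and the attribute list; the second shows a Range being wrapped as :in.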
Next we enter the implementation of validates_with:
def validates_with(*args, &block)
options = args.extract_options!
args.each do |klass|
validator = klass.new(options, &block)
validator.setup(self) if validator.respond_to?(:setup)
if validator.respond_to?(:attributes) && !validator.attributes.empty?
validator.attributes.each do |attribute|
_validators[attribute.to_sym] << validator
end
else
_validators[nil] << validator
end
validate(validator, options)
end
end
It first instantiates the validator class passed in. Most validators in Rails are subclasses of EachValidator, which is implemented in activemodel-3.2.13/lib/active_model/validator.rb:
class EachValidator < Validator
attr_reader :attributes
# Returns a new validator instance. All options will be available via the
# +options+ reader, however the <tt>:attributes</tt> option will be removed
# and instead be made available through the +attributes+ reader.
def initialize(options)
@attributes = Array.wrap(options.delete(:attributes))
raise ":attributes cannot be blank" if @attributes.empty?
super
check_validity!
end
# Performs validation on the supplied record. By default this will call
# +validates_each+ to determine validity therefore subclasses should
# override +validates_each+ with validation logic.
def validate(record)
attributes.each do |attribute|
value = record.read_attribute_for_validation(attribute)
next if (value.nil? && options[:allow_nil]) || (value.blank? && options[:allow_blank])
validate_each(record, attribute, value)
end
end
end
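The pattern used by EachValidator, a base class that iterates over attributes and a subclass that supplies validate_each, can be reproduced in a few lines. The classes below are hypothetical miniatures, not the Rails classes, and the record is a plain Struct with an errors array.

```ruby
# Miniature of the EachValidator pattern; names are made up for the sketch.
class EachValidatorSketch
  attr_reader :attributes, :options

  def initialize(options)
    options = options.dup
    @attributes = Array(options.delete(:attributes))
    raise ArgumentError, ":attributes cannot be blank" if @attributes.empty?
    @options = options
  end

  # Reads every configured attribute and hands it to validate_each.
  def validate(record)
    attributes.each do |attribute|
      value = record.public_send(attribute)
      next if value.nil? && options[:allow_nil]
      validate_each(record, attribute, value)
    end
  end
end

class PresenceSketch < EachValidatorSketch
  def validate_each(record, attribute, value)
    blank = value.nil? || value.to_s.strip.empty?
    record.errors << "#{attribute} can't be blank" if blank
  end
end

Account = Struct.new(:login, :email, :errors)

account = Account.new("alice", "", [])
PresenceSketch.new(:attributes => [:login, :email]).validate(account)
account.errors  # only the blank email produces an error
```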
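EachValidator receives the attributes at initialization and, when validate is called to do the actual validation, calls validate_each for each attribute. Back in validates_with, the validator is registered in _validators under each of its attributes, and finally validate is called. That method is defined in ActiveModel::Validations: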
def validate(*args, &block)
options = args.extract_options!
if options.key?(:on)
options = options.dup
options[:if] = Array.wrap(options[:if])
options[:if].unshift("validation_context == :#{options[:on]}")
end
args << options
set_callback(:validate, *args, &block)
end
As we can see, this only handles the :on option and then registers the :validate callback. That completes the declaration.
Now let us look at the implementation of transactions, defined in ActiveRecord::Transactions:
def transaction(options = {}, &block)
# See the ConnectionAdapters::DatabaseStatements#transaction API docs.
connection.transaction(options, &block)
end
connection.transaction is implemented in ActiveRecord::ConnectionAdapters::DatabaseStatements. It mainly handles the options and calls the database adapter's methods to begin, commit, and roll back the transaction:
def transaction(options = {})
options.assert_valid_keys :requires_new, :joinable
last_transaction_joinable = defined?(@transaction_joinable) ? @transaction_joinable : nil
if options.has_key?(:joinable)
@transaction_joinable = options[:joinable]
else
@transaction_joinable = true
end
requires_new = options[:requires_new] || !last_transaction_joinable
transaction_open = false
@_current_transaction_records ||= []
begin
if block_given?
if requires_new || open_transactions == 0
if open_transactions == 0
begin_db_transaction
elsif requires_new
create_savepoint
end
increment_open_transactions
transaction_open = true
@_current_transaction_records.push([])
end
yield
end
rescue Exception => database_transaction_rollback
if transaction_open && !outside_transaction?
transaction_open = false
decrement_open_transactions
if open_transactions == 0
rollback_db_transaction
rollback_transaction_records(true)
else
rollback_to_savepoint
rollback_transaction_records(false)
end
end
raise unless database_transaction_rollback.is_a?(ActiveRecord::Rollback)
end
ensure
@transaction_joinable = last_transaction_joinable
if outside_transaction?
@open_transactions = 0
elsif transaction_open
decrement_open_transactions
begin
if open_transactions == 0
commit_db_transaction
commit_transaction_records
else
release_savepoint
save_point_records = @_current_transaction_records.pop
unless save_point_records.blank?
@_current_transaction_records.push([]) if @_current_transaction_records.empty?
@_current_transaction_records.last.concat(save_point_records)
end
end
rescue Exception => database_transaction_rollback
if open_transactions == 0
rollback_db_transaction
rollback_transaction_records(true)
else
rollback_to_savepoint
rollback_transaction_records(false)
end
raise
end
end
end
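Here begin_db_transaction, create_savepoint, rollback_db_transaction, commit_db_transaction, release_savepoint, and rollback_to_savepoint are provided by the database adapter and create, roll back, or commit a transaction or a savepoint. For savepoints, Rails supplies a default current_savepoint_name method for adapters to use. It lives in ActiveRecord::ConnectionAdapters::AbstractAdapter, which is also the base class of all database adapters, and its implementation is very simple: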
def current_savepoint_name
"active_record_#{open_transactions}"
end
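To make the interplay of open_transactions and savepoint names concrete, here is a much reduced sketch of the nesting logic. FakeConnection is hypothetical: it only appends the statements an adapter would send to a log, ignores the :joinable option, and leaves out the record bookkeeping.

```ruby
# Hypothetical connection that logs statements instead of talking to a database.
class FakeConnection
  class Rollback < StandardError; end

  attr_reader :log

  def initialize
    @log = []
    @open_transactions = 0
  end

  def current_savepoint_name
    "active_record_#{@open_transactions}"
  end

  def transaction(requires_new: false)
    # A real transaction is opened at the outermost level, a savepoint when
    # a nested call asks for :requires_new; otherwise the call just joins.
    opened = @open_transactions.zero? || requires_new
    if opened
      @log << (@open_transactions.zero? ? "BEGIN" : "SAVEPOINT #{current_savepoint_name}")
      @open_transactions += 1
    end

    begin
      yield
    rescue StandardError => error
      if opened
        @open_transactions -= 1
        @log << (@open_transactions.zero? ? "ROLLBACK" : "ROLLBACK TO SAVEPOINT #{current_savepoint_name}")
      end
      # Rollback is swallowed here; every other error keeps propagating.
      raise unless error.is_a?(Rollback)
      return
    end

    if opened
      @open_transactions -= 1
      @log << (@open_transactions.zero? ? "COMMIT" : "RELEASE SAVEPOINT #{current_savepoint_name}")
    end
  end
end

conn = FakeConnection.new
conn.transaction do
  conn.transaction(requires_new: true) do
    raise FakeConnection::Rollback
  end
end
conn.log
```

The savepoint is created before the counter is incremented and rolled back after it is decremented, so both statements see the same counter value and therefore the same name.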
commit_transaction_records and rollback_transaction_records, on the other hand, are implemented by ActiveRecord itself. Let us look at commit_transaction_records first:
# Send a commit message to all records after they have been committed.
def commit_transaction_records
records = @_current_transaction_records.flatten
@_current_transaction_records.clear
unless records.blank?
records.uniq.each do |record|
begin
record.committed!
rescue Exception => e
record.logger.error(e) if record.respond_to?(:logger) && record.logger
end
end
end
end
This collects all ActiveRecord objects registered with the current transaction and calls committed! on each of them:
# Call the after_commit callbacks
def committed!
run_callbacks :commit
ensure
clear_transaction_record_state
end
This mainly runs the :commit callback and then calls clear_transaction_record_state:
# Clear the new record state and id of a record.
def clear_transaction_record_state
if defined?(@_start_transaction_state)
@_start_transaction_state[:level] = (@_start_transaction_state[:level] || 0) - 1
remove_instance_variable(:@_start_transaction_state) if @_start_transaction_state[:level] < 1
end
end
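@_start_transaction_state is created the first time save or destroy is called and records the state of the object within the current transaction. Here the method decrements the level stored in @_start_transaction_state and removes the variable once the level drops below 1.
That completes the life cycle of a transaction.
Of course, rollback_transaction_records may be called instead: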
# Send a rollback message to all records after they have been rolled back. If rollback
# is false, only rollback records since the last save point.
def rollback_transaction_records(rollback)
if rollback
records = @_current_transaction_records.flatten
@_current_transaction_records.clear
else
records = @_current_transaction_records.pop
end
unless records.blank?
records.uniq.each do |record|
begin
record.rolledback!(rollback)
rescue Exception => e
record.logger.error(e) if record.respond_to?(:logger) && record.logger
end
end
end
end
This implementation is very similar to commit_transaction_records; it mainly calls rolledback! on each ActiveRecord object:
def rolledback!(force_restore_state = false)
run_callbacks :rollback
ensure
IdentityMap.remove(self) if IdentityMap.enabled?
restore_transaction_record_state(force_restore_state)
end
Now let us look at the implementation of restore_transaction_record_state:
# Restore the new record state and id of a record that was previously saved by a call to save_record_state.
def restore_transaction_record_state(force = false)
if defined?(@_start_transaction_state)
@_start_transaction_state[:level] = (@_start_transaction_state[:level] || 0) - 1
if @_start_transaction_state[:level] < 1 || force
restore_state = remove_instance_variable(:@_start_transaction_state)
was_frozen = restore_state[:frozen?]
@attributes = @attributes.dup if @attributes.frozen?
@new_record = restore_state[:new_record]
@destroyed = restore_state[:destroyed]
if restore_state.has_key?(:id)
self.id = restore_state[:id]
else
@attributes.delete(self.class.primary_key)
@attributes_cache.delete(self.class.primary_key)
end
@attributes.freeze if was_frozen
end
end
end
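When a transaction is rolled back, this method restores the ActiveRecord object's state that was recorded earlier. restore_transaction_record_state also accepts a force argument. Judging from the transaction code above, rollback_transaction_records passes true when the outermost database transaction is rolled back, which forces the state to be restored regardless of the level; when only a savepoint is rolled back, as happens for a nested transaction opened with :requires_new, it passes false and the state is restored only once the level drops below 1.
Incidentally, this also shows that, as long as they are stored in the same database, the transaction methods of different ActiveRecord classes are interchangeable: the transaction code works with records of any class when saving.
Finally, we come to the save method. Walking through it shows how ActiveRecord writes an object to the database, along with some further details of transactions and how the validators declared earlier actually run. save is implemented in several layers; the first is defined in ActiveRecord::Transactions: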
def save(*)
rollback_active_record_state! do
with_transaction_returning_status { super }
end
end
Two blocks wrap the call to super here. The outermost method is rollback_active_record_state!:
# Reset id and @new_record if the transaction rolls back.
def rollback_active_record_state!
remember_transaction_record_state
yield
rescue Exception
IdentityMap.remove(self) if IdentityMap.enabled?
restore_transaction_record_state
raise
ensure
clear_transaction_record_state
end
This method calls remember_transaction_record_state, restore_transaction_record_state, and clear_transaction_record_state. We have already covered clear_transaction_record_state and restore_transaction_record_state, which clean up or restore from @_start_transaction_state, so let us look at remember_transaction_record_state:
# Save the new record state and id of a record so it can be restored later if a transaction fails.
def remember_transaction_record_state
@_start_transaction_state ||= {}
@_start_transaction_state[:id] = id if has_attribute?(self.class.primary_key)
unless @_start_transaction_state.include?(:new_record)
@_start_transaction_state[:new_record] = @new_record
end
unless @_start_transaction_state.include?(:destroyed)
@_start_transaction_state[:destroyed] = @destroyed
end
@_start_transaction_state[:level] = (@_start_transaction_state[:level] || 0) + 1
@_start_transaction_state[:frozen?] = @attributes.frozen?
end
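This method records the current state of the object and its attributes in @_start_transaction_state, so that whatever the object loses when an exception occurs can be restored afterwards.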
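The way remember, clear, and restore cooperate through the level counter can be sketched as follows. StatefulRecord is a hypothetical, simplified model: it snapshots only id and new_record, takes the snapshot once, and ignores the frozen and destroyed flags.

```ruby
# Hypothetical record that mimics the level-counted state snapshot.
class StatefulRecord
  attr_accessor :id, :new_record

  def initialize
    @id = nil
    @new_record = true
    @saved = nil
  end

  # Take a snapshot on the first call, then only count nesting levels.
  def remember_state
    @saved ||= { :id => @id, :new_record => @new_record, :level => 0 }
    @saved[:level] += 1
  end

  # Leave one level; drop the snapshot when the outermost level is left.
  def clear_state
    return unless @saved
    @saved[:level] -= 1
    @saved = nil if @saved[:level] < 1
  end

  # Leave one level; put the snapshot back at the outermost level or on force.
  def restore_state(force = false)
    return unless @saved
    @saved[:level] -= 1
    return unless @saved[:level] < 1 || force
    @id = @saved[:id]
    @new_record = @saved[:new_record]
    @saved = nil
  end

  def tracking?
    !@saved.nil?
  end
end

record = StatefulRecord.new
record.remember_state        # level 1
record.remember_state        # level 2, same snapshot
record.id = 7
record.new_record = false
record.restore_state         # level 1, nothing is restored yet
record.restore_state         # level 0, id and new_record are put back
```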
Next comes the following layer, with_transaction_returning_status:
def with_transaction_returning_status
status = nil
self.class.transaction do
add_to_transaction
status = yield
raise ActiveRecord::Rollback unless status
end
status
end
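Here we can see that save itself calls transaction again. It also follows that when a transaction would contain nothing but the save of a single object, there is no need to call transaction explicitly.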
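The idea of turning a falsy status into a rollback can be shown with a small stand-alone sketch. The methods below are hypothetical free functions that write to a log array in place of a database, and SketchRollback plays the role of ActiveRecord::Rollback.

```ruby
# Plays the role of ActiveRecord::Rollback in this sketch.
class SketchRollback < StandardError; end

# Hypothetical transaction helper: logs statements and swallows SketchRollback.
def sketch_transaction(log)
  log << "BEGIN"
  yield
  log << "COMMIT"
rescue StandardError => error
  log << "ROLLBACK"
  raise unless error.is_a?(SketchRollback)
end

# Runs the block in a transaction and rolls back when it returns a falsy status.
def sketch_with_status(log)
  status = nil
  sketch_transaction(log) do
    status = yield
    raise SketchRollback unless status
  end
  status
end

log = []
sketch_with_status(log) { true }   # commits and returns true
sketch_with_status(log) { false }  # rolls back quietly and returns false
log
```

This is why a failed save returns false without raising: the rollback exception is used purely as a signal and never leaves the transaction method.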
Inside this transaction, the first call is to add_to_transaction:
# Add the record to the current transaction so that the :after_rollback and :after_commit callbacks
# can be called.
def add_to_transaction
if self.class.connection.add_transaction_record(self)
remember_transaction_record_state
end
end
This too consists of two method calls. The first is add_transaction_record:
# Register a record with the current transaction so that its after_commit and after_rollback callbacks
# can be called.
def add_transaction_record(record)
last_batch = @_current_transaction_records.last
last_batch << record if last_batch
end
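This method tries to append the current object to the last array in @_current_transaction_records, so that it can later be committed or rolled back. Because a transaction has been opened again, remember_transaction_record_state has to run once more to increase the level by one; we do not need to go through its code again.
Next we enter the following layer of save, implemented in the ActiveRecord::AttributeMethods::Dirty module. As the name suggests, it deals with dirty data: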
# Attempts to +save+ the record and clears changed attributes if successful.
def save(*)
if status = super
@previously_changed = changes
@changed_attributes.clear
elsif IdentityMap.enabled?
IdentityMap.remove(self)
end
status
end
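This method stores the changes made to the attributes in @previously_changed and then clears @changed_attributes. The next layer is where all the validators declared earlier actually run; it is defined in the ActiveRecord::Validations module: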
# The validation process on save can be skipped by passing <tt>:validate => false</tt>. The regular Base#save method is
# replaced with this when the validations module is mixed in, which it is by default.
def save(options={})
perform_validations(options) ? super : false
end
It mainly calls perform_validations:
def perform_validations(options={})
perform_validation = options[:validate] != false
perform_validation ? valid?(options[:context]) : true
end
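This method first decides whether the validators need to run at all and then calls valid? to do the validation. valid? has two layers of implementation; the first is the one in ActiveRecord::Validations itself: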
def valid?(context = nil)
context ||= (new_record? ? :create : :update)
output = super(context)
errors.empty? && output
end
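valid? accepts a context argument that determines whether the :create or the :update validators run; when none is given, it is derived from new_record?. It then calls the parent implementation: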
# Runs all the specified validations and returns true if no errors were added
# otherwise false. Context can optionally be supplied to define which callbacks
# to test against (the context is defined on the validations using :on).
def valid?(context = nil)
current_context, self.validation_context = validation_context, context
errors.clear
run_validations!
ensure
self.validation_context = current_context
end
This mainly assigns validation_context and clears any existing errors, then calls run_validations! to actually run the validators:
# Overwrite run validations to include callbacks.
def run_validations!
run_callbacks(:validation) { super }
end
Running the validators really means running the :validation callback. This method is defined in ActiveModel::Validations::Callbacks. Here are the two class methods that register :validation callbacks:
def before_validation(*args, &block)
options = args.last
if options.is_a?(Hash) && options[:on]
options[:if] = Array.wrap(options[:if])
options[:if].unshift("self.validation_context == :#{options[:on]}")
end
set_callback(:validation, :before, *args, &block)
end
def after_validation(*args, &block)
options = args.extract_options!
options[:prepend] = true
options[:if] = Array.wrap(options[:if])
options[:if] << "!halted"
options[:if].unshift("self.validation_context == :#{options[:on]}") if options[:on]
set_callback(:validation, :after, *(args << options), &block)
end
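As we can see, callbacks registered through before_validation and after_validation are invoked at this point, before and after the body of the :validation callback. Because super is passed in the block, the parent implementation of run_validations! is called in between: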
def run_validations!
run_callbacks :validate
errors.empty?
end
This runs the :validate callbacks defined earlier, at which point all validators have been executed.
Assuming all validators have passed, super takes us to the next layer, implemented in ActiveRecord::Persistence:
def save(*)
begin
create_or_update
rescue ActiveRecord::RecordInvalid
false
end
end
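The layering of save traced so far relies on nothing more than Ruby's module lookup order: each module defines save and hands over with super. The sketch below uses hypothetical miniature modules that only record the order of the calls; Persistence returns true where the real code calls create_or_update.

```ruby
# Miniature layers; each one records its name and defers to the next via super.
module Persistence
  def save(*)
    trace << :persistence  # stands in for create_or_update
    true
  end
end

module Validations
  def save(options = {})
    trace << :validations
    (options[:validate] == false || valid?) ? super : false
  end
end

module Dirty
  def save(*)
    trace << :dirty
    status = super
    changes.clear if status  # changes are only forgotten after a successful save
    status
  end
end

module Transactions
  def save(*)
    trace << :transactions  # the real layer wraps super in a transaction
    super
  end
end

class LayeredRecord
  # Modules included later sit earlier in the lookup chain.
  include Persistence
  include Validations
  include Dirty
  include Transactions

  attr_reader :trace, :changes

  def initialize(valid)
    @valid = valid
    @trace = []
    @changes = { "login" => %w[alice bob] }
  end

  def valid?
    @valid
  end
end

record = LayeredRecord.new(true)
record.save
record.trace  # layers in the order they were entered
```

When validation fails, the chain stops in the Validations layer, false travels back up through Dirty and Transactions, and the pending changes are kept.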
The Persistence layer of save turns the call into a call to create_or_update, whose first layer is in ActiveRecord::Callbacks:
def create_or_update
run_callbacks(:save) { super }
end
As we can see, this runs the :save callback; before_save, around_save, and after_save all run around it. Going deeper brings us back to the implementation in ActiveRecord::Persistence:
def create_or_update
raise ReadOnlyRecord if readonly?
result = new_record? ? create : update
result != false
end
This is where create and update part ways. We follow create here; the implementation of update is very similar. The first layer of create is again a callback invocation in ActiveRecord::Callbacks, this time for :create:
def create
run_callbacks(:create) { super }
end
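The layer after that is the implementation in ActiveRecord::Timestamp, which fills in the time-related attributes of the ActiveRecord object. Rails has four such attributes, :created_at, :created_on, :updated_at, and :updated_on, and each one that is still nil is set to the current time on creation: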
def create
if self.record_timestamps
current_time = current_time_from_proper_timezone
all_timestamp_attributes.each do |column|
if respond_to?(column) && respond_to?("#{column}=") && self.send(column).nil?
write_attribute(column.to_s, current_time)
end
end
end
super
end
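The rule applied by this layer (fill in only the timestamp columns that exist and are still nil) can be sketched with a plain Hash standing in for the record's attributes. fill_timestamps is a hypothetical helper, not a Rails method.

```ruby
TIMESTAMP_COLUMNS = [:created_at, :created_on, :updated_at, :updated_on]

# Sets every timestamp column that the record has and that is still nil.
def fill_timestamps(attributes, now = Time.now)
  TIMESTAMP_COLUMNS.each do |column|
    next unless attributes.key?(column)  # the table has no such column
    attributes[column] = now if attributes[column].nil?
  end
  attributes
end

earlier = Time.at(0)
attributes = { :login => "alice", :created_at => earlier, :updated_at => nil }
fill_timestamps(attributes)
attributes[:created_at]  # still the value that was set explicitly
attributes[:updated_at]  # now holds the current time
```

Because of the nil check, a timestamp assigned by hand before saving is not overwritten on create.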
We continue to the next layer, the implementation in ActiveRecord::Persistence:
# Creates a record with values matching those of the instance attributes
# and returns its id.
def create
attributes_values = arel_attributes_values(!id.nil?)
new_id = self.class.unscoped.insert attributes_values
self.id ||= new_id if self.class.primary_key
IdentityMap.add(self) if IdentityMap.enabled?
@new_record = false
id
end
This first calls arel_attributes_values to obtain a Hash that maps Arel attributes to their values:
# Returns a copy of the attributes hash where all the values have been safely quoted for use in
# an Arel insert/update method.
def arel_attributes_values(include_primary_key = true, include_readonly_attributes = true, attribute_names = @attributes.keys)
attrs = {}
klass = self.class
arel_table = klass.arel_table
attribute_names.each do |name|
if (column = column_for_attribute(name)) && (include_primary_key || !column.primary)
if include_readonly_attributes || !self.class.readonly_attributes.include?(name)
value = if klass.serialized_attributes.include?(name)
@attributes[name].serialized_value
else
# FIXME: we need @attributes to be used consistently.
# If the values stored in @attributes were already type
# casted, this code could be simplified
read_attribute(name)
end
attrs[arel_table[name]] = value
end
end
end
attrs
end
That Hash is then passed to the insert method of the class's unscoped relation:
def insert(values)
primary_key_value = nil
if primary_key && Hash === values
primary_key_value = values[values.keys.find { |k|
k.name == primary_key
}]
if !primary_key_value && connection.prefetch_primary_key?(klass.table_name)
primary_key_value = connection.next_sequence_value(klass.sequence_name)
values[klass.arel_table[klass.primary_key]] = primary_key_value
end
end
im = arel.create_insert
im.into @table
conn = @klass.connection
substitutes = values.sort_by { |arel_attr,_| arel_attr.name }
binds = substitutes.map do |arel_attr, value|
[@klass.columns_hash[arel_attr.name], value]
end
substitutes.each_with_index do |tuple, i|
tuple[1] = conn.substitute_at(binds[i][0], i)
end
if values.empty? # empty insert
im.values = Arel.sql(connection.empty_insert_statement_value)
else
im.insert substitutes
end
conn.insert(
im,
'SQL',
primary_key,
primary_key_value,
nil,
binds)
end
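The bind handling in the insert method above boils down to three steps: sort the pairs by column name, keep the values aside as binds, and put placeholders into the statement. The sketch below illustrates just that with a hypothetical build_insert helper. It works on plain symbols instead of Arel attributes, performs no quoting, always uses ? as the placeholder, and uses DEFAULT VALUES for the empty case, which is an assumption of the sketch rather than what the adapter emits.

```ruby
# Hypothetical helper: returns the SQL text and the bind list for an INSERT.
def build_insert(table, values)
  # Sorting gives a stable column order regardless of Hash insertion order.
  ordered = values.sort_by { |column, _value| column.to_s }
  binds = ordered.map { |column, value| [column, value] }

  sql =
    if ordered.empty?
      "INSERT INTO #{table} DEFAULT VALUES"
    else
      columns = ordered.map { |column, _value| column }.join(", ")
      placeholders = (["?"] * ordered.size).join(", ")
      "INSERT INTO #{table} (#{columns}) VALUES (#{placeholders})"
    end

  [sql, binds]
end

sql, binds = build_insert("users", :login => "alice", :email => "a@example.com")
sql    # the statement with one placeholder per column
binds  # the values, in the same order as the placeholders
```

Keeping the values out of the SQL text means the statement is the same for every row with the same set of columns, which lets the adapter reuse it as a prepared statement.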
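This method builds the INSERT statement and executes it; it can be regarded as the most important method when saving a new record. It first checks whether the new object already has a primary key value. If not, it asks the database adapter whether the primary key can be prefetched, and if so assigns the prefetched value. It then creates Arel's insert manager, sorts the Hash passed in by column name, and maps it to an array of [column, value] pairs that serve as the binds. Next, the value part of each sorted pair is replaced with a bind placeholder (a question mark for SQLite), and the pairs are handed to the insert manager as its values. Finally, conn.insert executes the statement.
We now enter conn.insert. Calling insert first clears the query cache; that code is defined in ActiveRecord::ConnectionAdapters::QueryCache: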
def included(base)
dirties_query_cache base, :insert, :update, :delete
end
def dirties_query_cache(base, *method_names)
method_names.each do |method_name|
base.class_eval <<-end_code, __FILE__, __LINE__ + 1
def #{method_name}(*) # def update_with_query_dirty(*args)
clear_query_cache if @query_cache_enabled # clear_query_cache if @query_cache_enabled
super # update_without_query_dirty(*args)
end # end
end_code
end
end
As we can see, insert, update, and delete all clear the query cache completely. The code of clear_query_cache is very simple:
# Clears the query cache.
#
# One reason you may wish to call this method explicitly is between queries
# that ask the database to randomize results. Otherwise the cache would see
# the same SQL query and repeatedly return the same result each time, silently
# undermining the randomness you were expecting.
def clear_query_cache
@query_cache.clear
end
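The invalidation scheme can be reproduced in miniature: a module caches reads by SQL string and wraps every writing method so that the cache is emptied before the write goes through. FakeAdapter, CachedAdapter, and QueryCacheSketch are hypothetical names, and the rows live in an in-memory array.

```ruby
# Hypothetical caching layer: reads are memoized, writes clear the cache.
module QueryCacheSketch
  def select_all(sql)
    @query_cache ||= {}
    @query_cache[sql] = super unless @query_cache.key?(sql)
    @query_cache[sql]
  end

  [:insert, :update, :delete].each do |method_name|
    define_method(method_name) do |*args|
      (@query_cache ||= {}).clear
      super(*args)
    end
  end
end

# Hypothetical adapter that keeps rows in memory and counts executed reads.
class FakeAdapter
  attr_reader :executed

  def initialize
    @executed = []
    @rows = []
  end

  def select_all(sql)
    @executed << sql
    @rows.dup
  end

  def insert(row)
    @rows << row
  end

  def update(old_row, new_row)
    @rows.map! { |row| row == old_row ? new_row : row }
  end

  def delete(row)
    @rows.delete(row)
  end
end

class CachedAdapter < FakeAdapter
  include QueryCacheSketch
end

adapter = CachedAdapter.new
adapter.select_all("SELECT * FROM users")  # hits the fake database
adapter.select_all("SELECT * FROM users")  # served from the cache
adapter.insert("alice")                    # empties the cache
adapter.select_all("SELECT * FROM users")  # hits the fake database again
```

Clearing everything on any write is coarse, but it keeps the cache trivially correct: no cached result can outlive a change to the data.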
Now we enter insert proper, which is defined in ActiveRecord::ConnectionAdapters::DatabaseStatements:
def insert(arel, name = nil, pk = nil, id_value = nil, sequence_name = nil, binds = [])
sql, binds = sql_for_insert(to_sql(arel, binds), pk, id_value, sequence_name, binds)
value = exec_insert(sql, name, binds)
id_value || last_inserted_id(value)
end
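For SQLite 3, exec_insert is equivalent to exec_query. It executes the SQL generated by to_sql and returns the result; last_inserted_id then extracts the new record's id from it. The id is assigned to the object, and the insert is complete.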