class REXML::Source
A Source
can be searched for patterns, and wraps buffers and other objects and provides consumption of text
Attributes
The current buffer (what we’re going to read next)
The line number of the last consumed text
Public Class Methods
Constructor @param arg must be a String, and should be a valid XML document @param encoding if non-null, sets the encoding of the source to this value, overriding all encoding detection
# File rexml-3.2.6/lib/rexml/source.rb, line 43 def initialize(arg, encoding=nil) @orig = @buffer = arg if encoding self.encoding = encoding else detect_encoding end @line = 0 end
Public Instance Methods
# File rexml-3.2.6/lib/rexml/source.rb, line 87 def consume( pattern ) @buffer = $' if pattern.match( @buffer ) end
@return the current line in the source
# File rexml-3.2.6/lib/rexml/source.rb, line 117 def current_line lines = @orig.split res = lines.grep @buffer[0..30] res = res[-1] if res.kind_of? Array lines.index( res ) if res end
@return true if the Source
is exhausted
# File rexml-3.2.6/lib/rexml/source.rb, line 108 def empty? @buffer == "" end
Inherited from Encoding
Overridden to support optimized en/decoding
REXML::Encoding#encoding=
# File rexml-3.2.6/lib/rexml/source.rb, line 56 def encoding=(enc) return unless super encoding_updated end
# File rexml-3.2.6/lib/rexml/source.rb, line 101 def match(pattern, cons=false) md = pattern.match(@buffer) @buffer = $' if cons and md return md end
# File rexml-3.2.6/lib/rexml/source.rb, line 91 def match_to( char, pattern ) return pattern.match(@buffer) end
# File rexml-3.2.6/lib/rexml/source.rb, line 95 def match_to_consume( char, pattern ) md = pattern.match(@buffer) @buffer = $' return md end
# File rexml-3.2.6/lib/rexml/source.rb, line 112 def position @orig.index( @buffer ) end
# File rexml-3.2.6/lib/rexml/source.rb, line 84 def read end
Scans the source for a given pattern. Note, that this is not your usual scan() method. For one thing, the pattern argument has some requirements; for another, the source can be consumed. You can easily confuse this method. Originally, the patterns were easier to construct and this method more robust, because this method generated search regexps on the fly; however, this was computationally expensive and slowed down the entire REXML
package considerably, since this is by far the most commonly called method. @param pattern must be a Regexp, and must be in the form of /^s*(#{your pattern, with no groups})(.*)/. The first group will be returned; the second group is used if the consume flag is set. @param consume if true, the pattern returned will be consumed, leaving everything after it in the Source
. @return the pattern, if found, or nil if the Source
is empty or the pattern is not found.
# File rexml-3.2.6/lib/rexml/source.rb, line 77 def scan(pattern, cons=false) return nil if @buffer.nil? rv = @buffer.scan(pattern) @buffer = $' if cons and rv.size>0 rv end
Private Instance Methods
# File rexml-3.2.6/lib/rexml/source.rb, line 125 def detect_encoding buffer_encoding = @buffer.encoding detected_encoding = "UTF-8" begin @buffer.force_encoding("ASCII-8BIT") if @buffer[0, 2] == "\xfe\xff" @buffer[0, 2] = "" detected_encoding = "UTF-16BE" elsif @buffer[0, 2] == "\xff\xfe" @buffer[0, 2] = "" detected_encoding = "UTF-16LE" elsif @buffer[0, 3] == "\xef\xbb\xbf" @buffer[0, 3] = "" detected_encoding = "UTF-8" end ensure @buffer.force_encoding(buffer_encoding) end self.encoding = detected_encoding end
# File rexml-3.2.6/lib/rexml/source.rb, line 146 def encoding_updated if @encoding != 'UTF-8' @buffer = decode(@buffer) @to_utf = true else @to_utf = false @buffer.force_encoding ::Encoding::UTF_8 end end