# Copyright 2009 - present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# PODNAME: MongoDB::DataTypes
# ABSTRACT: Using MongoDB data types with Perl
# vim: set ts=4 sts=4 sw=4 et tw=75:
__END__
=pod
=encoding UTF-8
=head1 NAME
MongoDB::DataTypes - Using MongoDB data types with Perl
=head1 VERSION
version v2.2.2
=head1 DESCRIPTION
MongoDB stores typed data in a data format called BSON
(L). This document describes how to work with BSON
data types in the MongoDB Perl driver.
As of the MongoDB Perl driver v2.0.0, the driver relies on the external
L library (and optional L library) for converting between
Perl data and the MongoDB BSON format.
=head2 Additional information
Additional information about MongoDB documents and types may be found in
the following MongoDB manual pages:
=over 4
=item *
L
=item *
L
=back
=head1 ESSENTIAL CONCEPTS
=head2 MongoDB records are ordered documents
A MongoDB record (i.e. "row") is a BSON document -- a list of key-value
pairs, like a Perl hash except that the keys in a BSON document are
ordered. Keys are always strings. Values can be any of 20+ BSON types.
Queries and update specifications are also expressed as documents.
=head2 Type wrapper classes provide non-native and disambiguated types
In order to represent BSON types that don't natively exist in Perl, we use
type wrapper classes from the L library, such as L and
L.
Wrappers for native types are available when necessary to address
limitations in Perl's type system. For example, one can use L
for a ordered hash or L for a 64-bit integer.
The L class has attributes that configure how type wrappers are used
during encoding and decoding.
The L
documentation has a detailed table of all BSON type conversions.
=head2 String/number type conversion heuristics
Perl's scalar values can have several underlying, internal representations
such as double, integer, or string (see L). When encoding to
BSON, the default behavior is as follows:
=over 4
=item *
If the value has a valid double representation, it will be encoded to BSON as a double.
=item *
Otherwise, if the value has a valid integer interpretation, it will be encoded as either Int32 or Int64; the smallest type that the value fits will be used; a value that overflows will error.
=item *
Otherwise, the value will be encoded as a UTF-8 string.
=back
The L library provides the C attribute to more
aggressively coerce number-like strings that don't already have a numeric
representation into a numeric form.
=head2 Order sometimes matters a lot
When writing a query document, the order of B level keys doesn't
matter, but the order of keys in any embedded documents does matter.
$coll->insert_one({
name => { first => "John", last => "Doe" },
age => 42,
color => "blue",
});
# Order doesn't matter here
$coll->find( { age => 42, color => "blue" } ); # MATCH
$coll->find( { color => "blue", age => 42 } ); # MATCH
# Order *does* matter here
$coll->find(
{ name => { first => "John", last => "Doe" } } # MATCH
);
$coll->find(
{ name => { last => "Doe", first => "John" } } # NO MATCH
);
When specifying a sort order or the order of keys for an index, order
matters whenever there is more than one key.
Because of Perl's hash order randomization, be very careful using
native hashes with MongoDB. See the L section below for
specific guidance.
=head1 THE BSON::TYPES LIBRARY
L is a library with helper subroutines to easily create BSON
type wrappers. Use of this library is highly recommended.
use BSON::Types ':all';
$int64 = bson_int64(42); # force encoding more bits
$decimal = bson_decimal("24.01"); # Decimal128 type
$time = bson_time(); # now
Examples in the rest of this document assume that all L
helper functions are loaded.
=head1 NOTES ON SPECIFIC TYPES
=head2 Arrays
BSON arrays encode and decode via Perl array references.
=head2 Documents
Because Perl's hashes guarantee key-order randomness, using hash references
as documents will lead to BSON documents with a different key order. For
top-level keys, this shouldn't cause problems, but it may cause problems
for embedded documents when querying, sorting or indexing on the embedded
document.
For sending data to the server, the L class provides a very
lightweight wrapper around ordered key-value pairs, but it's opaque.
$doc = bson_doc( name => "Larry", color => "purple" );
You can also use L for a more-interactive ordered document,
but at the expense of tied-object overhead.
The L encoder has an C attribute that, if enabled, returns
all documents as order-preserving tied hashes. This is slow, but is the
only way to ensure that documents can roundtrip preserving key order.
=head2 Numbers
By default, the BSON decoder decodes doubles and integers into a
Perl-native form. To maximize fidelity during a roundtrip, the decoder
supports the L attribute to always decode
to a BSON type wrapper class with numeric overloading.
=head3 32-bit Platforms
On a 32-bit platform, the L library treats L as the
"native" type for integers outside the (signed) 32-bit range. Values that
are encoded as 64-bit integers will be decoded as L objects.
=head3 64-bit Platforms
On a 64-bit platform, (signed) Int64 values are supported, but, by default,
numbers will be stored in the smallest BSON size needed. To force a 64-bit
representation for numbers in the signed 32-bit range, use a type wrapper:
$int64 = bson_int64(0); # 64 bits of 0
=head3 Long doubles
On a perl compiled with long-double support, floating point number
precision will be lost when sending data to MongoDB.
=head3 Decimal128
MongoDB 3.4 adds support for the IEEE 754 Decimal128 type. The
L class is used as a proxy for these values for both
inserting and querying documents. Be sure to use B when
constructing Decimal128 objects.
$item = {
name => "widget",
price => bson_decimal128("4.99"), # 4.99 as a string
currency => "USD",
};
$coll->insert_one($item);
=head2 Strings
String values are expected to be character-data (not bytes). They are
encoded as UTF-8 before being sent to the database and decoded from UTF-8
when received. If a string can't be decoded, an error will be thrown.
To save or query arbitrary, non-UTF8 bytes, use a binary type wrapper (see
L, below).
=head2 Booleans
Boolean values are emulated using the L package via the
C and C functions. Using L objects
in documents will ensure the documents have the BSON boolean type in the
database. Likewise, BSON boolean types in the database will be returned
as L objects.
An example of inserting boolean values:
use boolean;
$collection->insert_one({"okay" => true, "name" => "fred"});
An example of using boolean values for query operators (only returns documents
where the name field exists):
$cursor = $collection->find({"name" => {'$exists' => true}});
Often, you can just use 1 or 0 in query operations instead of C and
C, but some commands require L objects and the database
will return an error if integers 1 or 0 are used.
Boolean objects from the following JSON libraries will also be encoded
correctly in the database:
=over 4
=item *
L
=item *
L
=item *
L
=item *
L
=item *
L
=back
=head2 Object IDs
The BSON object ID type (aka "OID") is a 12 byte identifier that ensures
uniqueness by mixing a timestamp and counter with host and
process-specific bytes.
All MongoDB documents have an C<_id> field as a unique identifier. This
field does not have to be an object ID, but if the field does not exist, an
object ID is created automatically for it when the document is inserted
into the database.
The L class is the type wrapper for object IDs.
To create a unique id:
$oid = bson_oid();
To create a L from an existing 24-character hexadecimal string:
$oid = bson_oid("123456789012345678901234");
=head2 Regular Expressions
Use C to use a regular expression in a query, but be sure to limit
your regular expression to syntax and features supported by PCRE, which are
L.
$cursor = $collection->find({"name" => qr/[Jj]oh?n/});
Regular expressions will match strings saved in the database.
B: only the following flags are supported: "imsxlu".
You can also save and retrieve regular expressions themselves, but
regular expressions will be retrieved as L
objects for safety (these will round-trip correctly).
From that object, you can attempt to compile a reference to a C using
the C method. However, due to PCRE differences, this could fail
to compile or could have different match behavior than intended.
$collection->insert_one({"regex" => qr/foo/i});
$obj = $collection->find_one;
if ("FOO" =~ $obj->{regex}->try_compile) { # matches
print "hooray\n";
}
B: A regular expression can evaluate arbitrary code if C