- Retrieving Data & Results Sets
- Debugging Queries and ResultSets
- Getting a Single Entity by Primary Key
- Using Finders to Load Data
- Getting the First Result
- Getting a Count of Results
- Finding Key/Value Pairs
- Finding Threaded Data
- Custom Finder Methods
- Dynamic Finders
- Retrieving Associated Data
- Eager Loading Associations Via Contain
- Filtering by Associated Data Via Matching And Joins
- Changing Fetching Strategies
- Lazy Loading Associations
- Working with Result Sets
- Modifying Results with Map/Reduce
Retrieving Data & Results Sets
- class
Cake\ORM\
Table
- While table objects provide an abstraction around a ‘repository’ or collectionof objects, when you query for individual records you get ‘entity’ objects.While this section discusses the different ways you can find and load entities,you should read the Entities section for more information onentities.
Debugging Queries and ResultSets
Since the ORM now returns Collections and Entities, debugging these objects canbe more complicated than in previous CakePHP versions. There are now variousways to inspect the data returned by the ORM.
debug($query)
Shows the SQL and bound parameters, does not show results.sql($query)
Shows the final rendered SQL, but only when having DebugKit installed.debug($query->all())
Shows the ResultSet properties (not the results).debug($query->toList())
An easy way to show each of the results.debug(iterator_to_array($query))
Shows query results in an array format.debug(json_encode($query, JSON_PRETTY_PRINT))
More human readable results.debug($query->first())
Show the properties of a single entity.debug((string)$query->first())
Show the properties of a single entity as JSON.
Getting a Single Entity by Primary Key
Cake\ORM\Table::
get
($id, $options = [])- It is often convenient to load a single entity from the database when editing orviewing entities and their related data. You can do this by using
get()
:
- // In a controller or table method.
- // Get a single article
- $article = $articles->get($id);
- // Get a single article, and related comments
- $article = $articles->get($id, [
- 'contain' => ['Comments']
- ]);
If the get operation does not find any results aCake\Datasource\Exception\RecordNotFoundException
will be raised. You caneither catch this exception yourself, or allow CakePHP to convert it into a 404error.
Like find()
, get()
also has caching integrated. You can use thecache
option when calling get()
to perform read-through caching:
- // In a controller or table method.
- // Use any cache config or CacheEngine instance & a generated key
- $article = $articles->get($id, [
- 'cache' => 'custom',
- ]);
- // Use any cache config or CacheEngine instance & specific key
- $article = $articles->get($id, [
- 'cache' => 'custom', 'key' => 'mykey'
- ]);
- // Explicitly disable caching
- $article = $articles->get($id, [
- 'cache' => false
- ]);
Optionally you can get()
an entity using Custom Finder Methods. Forexample you may want to get all translations for an entity. You can achieve thatby using the finder
option:
- $article = $articles->get($id, [
- 'finder' => 'translations',
- ]);
Using Finders to Load Data
Cake\ORM\Table::
find
($type, $options = [])- Before you can work with entities, you’ll need to load them. The easiest way todo this is using the
find()
method. The find method provides an easy andextensible way to find the data you are interested in:
- // In a controller or table method.
- // Find all the articles
- $query = $articles->find('all');
The return value of any find()
method is alwaysa Cake\ORM\Query
object. The Query class allows you to furtherrefine a query after creating it. Query objects are evaluated lazily, and do notexecute until you start fetching rows, convert it to an array, or when theall()
method is called:
- // In a controller or table method.
- // Find all the articles.
- // At this point the query has not run.
- $query = $articles->find('all');
- // Iteration will execute the query.
- foreach ($query as $row) {
- }
- // Calling all() will execute the query
- // and return the result set.
- $results = $query->all();
- // Once we have a result set we can get all the rows
- $data = $results->toList();
- // Converting the query to a key-value array will also execute it.
- $data = $query->toArray();
Note
Once you’ve started a query you can use the Query Builderinterface to build more complex queries, adding additional conditions,limits, or include associations using the fluent interface.
- // In a controller or table method.
- $query = $articles->find('all')
- ->where(['Articles.created >' => new DateTime('-10 days')])
- ->contain(['Comments', 'Authors'])
- ->limit(10);
You can also provide many commonly used options to find()
. This can helpwith testing as there are fewer methods to mock:
- // In a controller or table method.
- $query = $articles->find('all', [
- 'conditions' => ['Articles.created >' => new DateTime('-10 days')],
- 'contain' => ['Authors', 'Comments'],
- 'limit' => 10
- ]);
The list of options supported by find() are:
conditions
provide conditions for the WHERE clause of your query.limit
Set the number of rows you want.offset
Set the page offset you want. You can also usepage
to makethe calculation simpler.contain
define the associations to eager load.fields
limit the fields loaded into the entity. Only loading some fieldscan cause entities to behave incorrectly.group
add a GROUP BY clause to your query. This is useful when usingaggregating functions.having
add a HAVING clause to your query.join
define additional custom joins.order
order the result set.
Any options that are not in this list will be passed to beforeFind listenerswhere they can be used to modify the query object. You can use thegetOptions()
method on a query object to retrieve the options used. Whileyou can pass query objects to your controllers, we recommend that you packageyour queries up as Custom Finder Methods instead. Using custom findermethods will let you re-use your queries and make testing easier.
By default queries and result sets will return Entities objects. Youcan retrieve basic arrays by disabling hydration:
- $query->enableHydration(false);
- // Prior to 3.4.0
- $query->hydrate(false);
- // $data is ResultSet that contains array data.
- $data = $query->all();
Getting the First Result
The first()
method allows you to fetch only the first row from a query. Ifthe query has not been executed, a LIMIT 1
clause will be applied:
- // In a controller or table method.
- $query = $articles->find('all', [
- 'order' => ['Articles.created' => 'DESC']
- ]);
- $row = $query->first();
This approach replaces find('first')
in previous versions of CakePHP. Youmay also want to use the get()
method if you are loading entities by primarykey.
Note
The first()
method will return null
if no results are found.
Getting a Count of Results
Once you have created a query object, you can use the count()
method to geta result count of that query:
- // In a controller or table method.
- $query = $articles->find('all', [
- 'conditions' => ['Articles.title LIKE' => '%Ovens%']
- ]);
- $number = $query->count();
See Returning the Total Count of Records for additional usage of the count()
method.
Finding Key/Value Pairs
It is often useful to generate an associative array of data from yourapplication’s data. For example, this is very useful when creating <select>
elements. CakePHP provides a simple to use method for generating ‘lists’ ofdata:
- // In a controller or table method.
- $query = $articles->find('list');
- $data = $query->toArray();
- // Data now looks like
- $data = [
- 1 => 'First post',
- 2 => 'Second article I wrote',
- ];
With no additional options the keys of $data
will be the primary key of yourtable, while the values will be the ‘displayField’ of the table. You can use thesetDisplayField()
method on a table object to configure the display field ofa table:
- class ArticlesTable extends Table
- {
- public function initialize(array $config)
- {
- $this->setDisplayField('title');
- // Prior to 3.4.0
- $this->displayField('title');
- }
- }
When calling list
you can configure the fields used for the key and valuewith the keyField
and valueField
options respectively:
- // In a controller or table method.
- $query = $articles->find('list', [
- 'keyField' => 'slug',
- 'valueField' => 'title'
- ]);
- $data = $query->toArray();
- // Data now looks like
- $data = [
- 'first-post' => 'First post',
- 'second-article-i-wrote' => 'Second article I wrote',
- ];
Results can be grouped into nested sets. This is useful when you wantbucketed sets, or want to build <optgroup>
elements with FormHelper:
- // In a controller or table method.
- $query = $articles->find('list', [
- 'keyField' => 'slug',
- 'valueField' => 'title',
- 'groupField' => 'author_id'
- ]);
- $data = $query->toArray();
- // Data now looks like
- $data = [
- 1 => [
- 'first-post' => 'First post',
- 'second-article-i-wrote' => 'Second article I wrote',
- ],
- 2 => [
- // More data.
- ]
- ];
You can also create list data from associations that can be reached with joins:
- $query = $articles->find('list', [
- 'keyField' => 'id',
- 'valueField' => 'author.name'
- ])->contain(['Authors']);
Customize Key-Value Output
Lastly it is possible to use closures to access entity accessor methods in yourlist finds.
- // In your Authors Entity create a virtual field to be used as the displayField:
- protected function _getLabel()
- {
- return $this->_properties['first_name'] . ' ' . $this->_properties['last_name']
- . ' / ' . __('User ID %s', $this->_properties['user_id']);
- }
This example shows using the _getLabel()
accessor method fromthe Author entity.
- // In your finders/controller:
- $query = $articles->find('list', [
- 'keyField' => 'id',
- 'valueField' => function ($article) {
- return $article->author->get('label');
- }
- ]);
You can also fetch the label in the list directly using.
- // In AuthorsTable::initialize():
- $this->setDisplayField('label'); // Will utilize Author::_getLabel()
- // In your finders/controller:
- $query = $authors->find('list'); // Will utilize AuthorsTable::getDisplayField()
Finding Threaded Data
The find('threaded')
finder returns nested entities that are threadedtogether through a key field. By default this field is parent_id
. Thisfinder allows you to access data stored in an ‘adjacency list’ style table. Allentities matching a given parent_id
are placed under the children
attribute:
- // In a controller or table method.
- $query = $comments->find('threaded');
- // Expanded default values
- $query = $comments->find('threaded', [
- 'keyField' => $comments->primaryKey(),
- 'parentField' => 'parent_id'
- ]);
- $results = $query->toArray();
- echo count($results[0]->children);
- echo $results[0]->children[0]->comment;
The parentField
and keyField
keys can be used to define the fields thatthreading will occur on.
Tip
If you need to manage more advanced trees of data, consider usingTree instead.
Custom Finder Methods
The examples above show how to use the built-in all
and list
finders.However, it is possible and recommended that you implement your own findermethods. Finder methods are the ideal way to package up commonly used queries,allowing you to abstract query details into a simple to use method. Findermethods are defined by creating methods following the convention of findFoo
where Foo
is the name of the finder you want to create. For example if wewanted to add a finder to our articles table for finding published articles wewould do the following:
- use Cake\ORM\Query;
- use Cake\ORM\Table;
- class ArticlesTable extends Table
- {
- public function findOwnedBy(Query $query, array $options)
- {
- $user = $options['user'];
- return $query->where(['author_id' => $user->id]);
- }
- }
- // In a controller or table method.
- $articles = TableRegistry::get('Articles');
- $query = $articles->find('ownedBy', ['user' => $userEntity]);
Finder methods can modify the query as required, or use the $options
tocustomize the finder operation with relevant application logic. You can also‘stack’ finders, allowing you to express complex queries effortlessly. Assumingyou have both the ‘published’ and ‘recent’ finders, you could do the following:
- // In a controller or table method.
- $articles = TableRegistry::get('Articles');
- $query = $articles->find('published')->find('recent');
While all the examples so far have shown finder methods on table classes, findermethods can also be defined on Behaviors.
If you need to modify the results after they have been fetched you should usea Modifying Results with Map/Reduce function to modify the results. The map reduce featuresreplace the ‘afterFind’ callback found in previous versions of CakePHP.
Note
Passing arguments exposed in the config array,$products->find('sizes', ['large', 'medium'])
can give unexpected results when chainingcustom finders. Always pass options as an associative array,$products->find('sizes', ['values' => ['large', 'medium']])
Dynamic Finders
CakePHP’s ORM provides dynamically constructed finder methods which allow you toexpress simple queries with no additional code. For example if you wanted tofind a user by username you could do:
- // In a controller
- // The following two calls are equal.
- $query = $this->Users->findByUsername('joebob');
- $query = $this->Users->findAllByUsername('joebob');
- // In a table method
- $users = TableRegistry::get('Users');
- // The following two calls are equal.
- $query = $users->findByUsername('joebob');
- $query = $users->findAllByUsername('joebob');
When using dynamic finders you can constrain on multiple fields:
- $query = $users->findAllByUsernameAndApproved('joebob', 1);
You can also create OR
conditions:
- $query = $users->findAllByUsernameOrEmail('joebob', 'joe@example.com');
While you can use either OR
or AND
conditions, you cannot combine thetwo in a single dynamic finder. Other query options like contain
are alsonot supported with dynamic finders. You should use Custom Finder Methods toencapsulate more complex queries. Lastly, you can also combine dynamic finderswith custom finders:
- $query = $users->findTrollsByUsername('bro');
The above would translate into the following:
- $users->find('trolls', [
- 'conditions' => ['username' => 'bro']
- ]);
Once you have a query object from a dynamic finder, you’ll need to callfirst()
if you want the first result.
Note
While dynamic finders make it simple to express queries, they add a smallamount of overhead. You cannot call findBy
methods from a query object.When using a finder chain the dynamic finder must be called first.
Retrieving Associated Data
When you want to grab associated data, or filter based on associated data, thereare two ways:
- use CakePHP ORM query functions like
contain()
andmatching()
- use join functions like
innerJoin()
,leftJoin()
, andrightJoin()
You should usecontain()
when you want to load the primary model, and itsassociated data. Whilecontain()
will let you apply additional conditions tothe loaded associations, you cannot constrain the primary model based on theassociations. For more details on thecontain()
, look atEager Loading Associations Via Contain.
You should use matching()
when you want to restrict the primary model basedon associations. For example, you want to load all the articles that havea specific tag on them. For more details on the matching()
, look atFiltering by Associated Data Via Matching And Joins.
If you prefer to use join functions, you can look atAdding Joins for more information.
Eager Loading Associations Via Contain
By default CakePHP does not load any associated data when using find()
.You need to ‘contain’ or eager-load each association you want loaded in yourresults.
Eager loading helps avoid many of the potential performance problemssurrounding lazy-loading in an ORM. The queries generated by eager loading canbetter leverage joins, allowing more efficient queries to be made. In CakePHPyou define eager loaded associations using the ‘contain’ method:
- // In a controller or table method.
- // As an option to find()
- $query = $articles->find('all', ['contain' => ['Authors', 'Comments']]);
- // As a method on the query object
- $query = $articles->find('all');
- $query->contain(['Authors', 'Comments']);
The above will load the related author and comments for each article in theresult set. You can load nested associations using nested arrays to define theassociations to be loaded:
- $query = $articles->find()->contain([
- 'Authors' => ['Addresses'], 'Comments' => ['Authors']
- ]);
Alternatively, you can express nested associations using the dot notation:
- $query = $articles->find()->contain([
- 'Authors.Addresses',
- 'Comments.Authors'
- ]);
You can eager load associations as deep as you like:
- $query = $products->find()->contain([
- 'Shops.Cities.Countries',
- 'Shops.Managers'
- ]);
You can select fields from all associations with multiple easy contain()
statements:
- $query = $this->find()->select([
- 'Realestates.id',
- 'Realestates.title',
- 'Realestates.description'
- ])
- ->contain([
- 'RealestateAttributes' => [
- 'Attributes' => [
- 'fields' => [
- // Aliased fields in contain() must include
- // the model prefix to be mapped correctly.
- 'Attributes__name' => 'attr_name'
- ]
- ]
- ]
- ])
- ->contain([
- 'RealestateAttributes' => [
- 'fields' => [
- 'RealestateAttributes.realestate_id',
- 'RealestateAttributes.value'
- ]
- ]
- ])
- ->where($condition);
If you need to reset the containments on a query you can set the second argumentto true
:
- $query = $articles->find();
- $query->contain(['Authors', 'Comments'], true);
Passing Conditions to Contain
When using contain()
you are able to restrict the data returned by theassociations and filter them by conditions. To specify conditions, pass an anonymousfunction that receives as the first argument a query object, \Cake\ORM\Query
:
- // In a controller or table method.
- // Prior to 3.5.0 you would use contain(['Comments' => function () { ... }])
- $query = $articles->find()->contain('Comments', function (Query $q) {
- return $q
- ->select(['body', 'author_id'])
- ->where(['Comments.approved' => true]);
- });
This also works for pagination at the Controller level:
- $this->paginate['contain'] = [
- 'Comments' => function (Query $query) {
- return $query->select(['body', 'author_id'])
- ->where(['Comments.approved' => true]);
- }
- ];
Note
When you limit the fields that are fetched from an association, you mustensure that the foreign key columns are selected. Failing to select foreignkey fields will cause associated data to not be present in the final result.
It is also possible to restrict deeply-nested associations using the dotnotation:
- $query = $articles->find()->contain([
- 'Comments',
- 'Authors.Profiles' => function (Query $q) {
- return $q->where(['Profiles.is_published' => true]);
- }
- ]);
In the above example, you’ll still get authors even if they don’t havea published profile. To only get authors with a published profile usematching(). If you have defined customfinders in your associations, you can use them inside contain()
:
- // Bring all articles, but only bring the comments that are approved and
- // popular.
- $query = $articles->find()->contain('Comments', function (Query $q) {
- return $q->find('approved')->find('popular');
- });
Note
For BelongsTo
and HasOne
associations only the where
andselect
clauses are used when loading the associated records. For therest of the association types you can use every clause that the query objectprovides.
If you need full control over the query that is generated, you can tell contain()
to not append the foreignKey
constraints to the generated query. In thatcase you should use an array passing foreignKey
and queryBuilder
:
- $query = $articles->find()->contain([
- 'Authors' => [
- 'foreignKey' => false,
- 'queryBuilder' => function (Query $q) {
- return $q->where(...); // Full conditions for filtering
- }
- ]
- ]);
If you have limited the fields you are loading with select()
but also want toload fields off of contained associations, you can pass the association objectto select()
:
- // Select id & title from articles, but all fields off of Users.
- $query = $articles->find()
- ->select(['id', 'title'])
- ->select($articles->Users)
- ->contain(['Users']);
Alternatively, if you have multiple associations, you can use enableAutoFields()
:
- // Select id & title from articles, but all fields off of Users, Comments
- // and Tags.
- $query->select(['id', 'title'])
- ->contain(['Comments', 'Tags'])
- ->enableAutoFields(true) // Prior to 3.4.0 use autoFields(true)
- ->contain(['Users' => function(Query $q) {
- return $q->autoFields(true);
- }]);
New in version 3.1: Selecting columns via an association object was added in 3.1
Sorting Contained Associations
When loading HasMany and BelongsToMany associations, you can use the sort
option to sort the data in those associations:
- $query->contain([
- 'Comments' => [
- 'sort' => ['Comments.created' => 'DESC']
- ]
- ]);
Filtering by Associated Data Via Matching And Joins
A fairly common query case with associations is finding records ‘matching’specific associated data. For example if you have ‘Articles belongsToMany Tags’you will probably want to find Articles that have the CakePHP tag. This isextremely simple to do with the ORM in CakePHP:
- // In a controller or table method.
- $query = $articles->find();
- $query->matching('Tags', function ($q) {
- return $q->where(['Tags.name' => 'CakePHP']);
- });
You can apply this strategy to HasMany associations as well. For example if‘Authors HasMany Articles’, you could find all the authors with recentlypublished articles using the following:
- $query = $authors->find();
- $query->matching('Articles', function ($q) {
- return $q->where(['Articles.created >=' => new DateTime('-10 days')]);
- });
Filtering by deep associations is surprisingly easy, and the syntax should bealready familiar to you:
- // In a controller or table method.
- $query = $products->find()->matching(
- 'Shops.Cities.Countries', function ($q) {
- return $q->where(['Countries.name' => 'Japan']);
- }
- );
- // Bring unique articles that were commented by 'markstory' using passed variable
- // Dotted matching paths should be used over nested matching() calls
- $username = 'markstory';
- $query = $articles->find()->matching('Comments.Users', function ($q) use ($username) {
- return $q->where(['username' => $username]);
- });
Note
As this function will create an INNER JOIN
, you might want to considercalling distinct
on the find query as you might get duplicate rows ifyour conditions don’t exclude them already. This might be the case, forexample, when the same users comments more than once on a single article.
The data from the association that is ‘matched’ will be available on the_matchingData
property of entities. If both match and contain the sameassociation, you can expect to get both the _matchingData
and standardassociation properties in your results.
Using innerJoinWith
Using the matching()
function, as we saw already, will create an INNER
with the specified association and will also load the fields into theresult set.
JOIN
There may be cases where you want to use matching()
but are not interestedin loading the fields into the result set. For this purpose, you can useinnerJoinWith()
:
- $query = $articles->find();
- $query->innerJoinWith('Tags', function ($q) {
- return $q->where(['Tags.name' => 'CakePHP']);
- });
The innerJoinWith()
method works the same as matching()
, thatmeans that you can use dot notation to join deeply nestedassociations:
- $query = $products->find()->innerJoinWith(
- 'Shops.Cities.Countries', function ($q) {
- return $q->where(['Countries.name' => 'Japan']);
- }
- );
Again, the only difference is that no additional columns will be added to theresult set, and no _matchingData
property will be set.However, it is possible to combine innerJoinWith()
and contain()
when you need to filter by associate data and you want also to retrieve associate fields too (following the same filter):
- $filter = ['Tags.name' => 'CakePHP'];
- $query = $articles->find()
- ->distinct($articles)
- ->contain('Tags', function (\Cake\ORM\Query $q) use ($filter) {
- return $q->where($filter);
- })
- ->innerJoinWith('Tags', function (\Cake\ORM\Query $q) use ($filter) {
- return $q->where($filter);
- });
New in version 3.1: Query::innerJoinWith() was added in 3.1
Using notMatching
The opposite of matching()
is notMatching()
. This function will changethe query so that it filters results that have no relation to the specifiedassociation:
- // In a controller or table method.
- $query = $articlesTable
- ->find()
- ->notMatching('Tags', function ($q) {
- return $q->where(['Tags.name' => 'boring']);
- });
The above example will find all articles that were not tagged with the wordboring
. You can apply this method to HasMany associations as well. You could,for example, find all the authors with no published articles in the last 10days:
- $query = $authorsTable
- ->find()
- ->notMatching('Articles', function ($q) {
- return $q->where(['Articles.created >=' => new \DateTime('-10 days')]);
- });
It is also possible to use this method for filtering out records not matchingdeep associations. For example, you could find articles that have not beencommented on by a certain user:
- $query = $articlesTable
- ->find()
- ->notMatching('Comments.Users', function ($q) {
- return $q->where(['username' => 'jose']);
- });
Since articles with no comments at all also satisfy the condition above, you maywant to combine matching()
and notMatching()
in the same query. Thefollowing example will find articles having at least one comment, but notcommented by a certain user:
- $query = $articlesTable
- ->find()
- ->notMatching('Comments.Users', function ($q) {
- return $q->where(['username' => 'jose']);
- })
- ->matching('Comments');
Note
As notMatching()
will create a LEFT JOIN
, you might want to considercalling distinct
on the find query as you can get duplicate rowsotherwise.
Keep in mind that contrary to the matching()
function, notMatching()
will not add any data to the _matchingData
property in the results.
New in version 3.1: Query::notMatching() was added in 3.1
Using leftJoinWith
On certain occasions you may want to calculate a result based on an association,without having to load all the records for it. For example, if you wanted toload the total number of comments an article has along with all the articledata, you can use the leftJoinWith()
function:
- $query = $articlesTable->find();
- $query->select(['total_comments' => $query->func()->count('Comments.id')])
- ->leftJoinWith('Comments')
- ->group(['Articles.id'])
- ->enableAutoFields(true); // Prior to 3.4.0 use autoFields(true);
The results for the above query will contain the article data and thetotal_comments
property for each of them.
leftJoinWith()
can also be used with deeply nested associations. This isuseful, for example, for bringing the count of articles tagged with a certainword, per author:
- $query = $authorsTable
- ->find()
- ->select(['total_articles' => $query->func()->count('Articles.id')])
- ->leftJoinWith('Articles.Tags', function ($q) {
- return $q->where(['Tags.name' => 'awesome']);
- })
- ->group(['Authors.id'])
- ->enableAutoFields(true); // Prior to 3.4.0 use autoFields(true);
This function will not load any columns from the specified associations into theresult set.
New in version 3.1: Query::leftJoinWith() was added in 3.1
Changing Fetching Strategies
As you may know already, belongsTo
and hasOne
associations are loadedusing a JOIN
in the main finder query. While this improves query andfetching speed and allows for creating more expressive conditions whenretrieving data, this may be a problem when you want to apply certain clauses tothe finder query for the association, such as order()
or limit()
.
For example, if you wanted to get the first comment of an article as anassociation:
- $articles->hasOne('FirstComment', [
- 'className' => 'Comments',
- 'foreignKey' => 'article_id'
- ]);
In order to correctly fetch the data from this association, we will need to tellthe query to use the select
strategy, since we want order by a particularcolumn:
- $query = $articles->find()->contain([
- 'FirstComment' => [
- 'strategy' => 'select',
- 'queryBuilder' => function ($q) {
- return $q->order(['FirstComment.created' =>'ASC'])->limit(1);
- }
- ]
- ]);
Dynamically changing the strategy in this way will only apply to a specificquery. If you want to make the strategy change permanent you can do:
- $articles->FirstComment->setStrategy('select');
- // Prior to 3.4.0
- $articles->FirstComment->strategy('select');
Using the select
strategy is also a great way of making associations withtables in another database, since it would not be possible to fetch recordsusing joins
.
Fetching With The Subquery Strategy
As your tables grow in size, fetching associations from them can becomeslower, especially if you are querying big batches at once. A good way ofoptimizing association loading for hasMany
and belongsToMany
associations is by using the subquery
strategy:
- $query = $articles->find()->contain([
- 'Comments' => [
- 'strategy' => 'subquery',
- 'queryBuilder' => function ($q) {
- return $q->where(['Comments.approved' => true]);
- }
- ]
- ]);
The result will remain the same as with using the default strategy, but thiscan greatly improve the query and fetching time in some databases, inparticular it will allow to fetch big chunks of data at the same time indatabases that limit the amount of bound parameters per query, such asMicrosoft SQL Server.
You can also make the strategy permanent for the association by doing:
- $articles->Comments->setStrategy('subquery');
- // Prior to 3.4.0
- $articles->Comments->strategy('subquery');
Lazy Loading Associations
While CakePHP makes it easy to eager load your associations, there may be caseswhere you need to lazy-load associations. You should refer to theLazy Loading Associations and Loading Additional Associationssections for more information.
Working with Result Sets
Once a query is executed with all()
, you will get an instance ofCake\ORM\ResultSet
. This object offers powerful ways to manipulatethe resulting data from your queries. Like Query objects, ResultSets area Collection and you can use any collectionmethod on ResultSet objects.
Result set objects will lazily load rows from the underlying prepared statement.By default results will be buffered in memory allowing you to iterate a resultset multiple times, or cache and iterate the results. If you need work witha data set that does not fit into memory you can disable buffering on the queryto stream results:
- $query->enableBufferedResults(false);
- // Prior to 3.4.0
- $query->bufferResults(false);
Turning buffering off has a few caveats:
- You will not be able to iterate a result set more than once.
- You will also not be able to iterate & cache the results.
- Buffering cannot be disabled for queries that eager load hasMany orbelongsToMany associations, as these association types require eagerlyloading all results so that dependent queries can be generated.
Warning
Streaming results will still allocate memory for the entire results whenusing PostgreSQL and SQL Server. This is due to limitations in PDO.
Result sets allow you to cache/serialize or JSON encode results for APIresults:
- // In a controller or table method.
- $results = $query->all();
- // Serialized
- $serialized = serialize($results);
- // Json
- $json = json_encode($results);
Both serializing and JSON encoding result sets work as you would expect. Theserialized data can be unserialized into a working result set. Converting toJSON respects hidden & virtual field settings on all entity objectswithin a result set.
In addition to making serialization easy, result sets are a ‘Collection’ object andsupport the same methods that collection objectsdo. For example, you can extract a list of unique tags on a collection ofarticles by running:
- // In a controller or table method.
- $articles = TableRegistry::get('Articles');
- $query = $articles->find()->contain(['Tags']);
- $reducer = function ($output, $value) {
- if (!in_array($value, $output)) {
- $output[] = $value;
- }
- return $output;
- };
- $uniqueTags = $query->all()
- ->extract('tags.name')
- ->reduce($reducer, []);
Some other examples of the collection methods being used with result sets are:
- // Filter the rows by a calculated property
- $filtered = $results->filter(function ($row) {
- return $row->is_recent;
- });
- // Create an associative array from result properties
- $articles = TableRegistry::get('Articles');
- $results = $articles->find()->contain(['Authors'])->all();
- $authorList = $results->combine('id', 'author.name');
The Collections chapter has more detail on what can bedone with result sets using the collections features. The Adding Calculated Fieldssection show how you can add calculated fields, or replace the result set.
Getting the First & Last Record From a ResultSet
You can use the first()
and last()
methods to get the respective recordsfrom a result set:
- $result = $articles->find('all')->all();
- // Get the first and/or last result.
- $row = $result->first();
- $row = $result->last();
Getting an Arbitrary Index From a ResultSet
You can use skip()
and first()
to get an arbitrary record froma ResultSet:
- $result = $articles->find('all')->all();
- // Get the 5th record
- $row = $result->skip(4)->first();
Checking if a Query or ResultSet is Empty
You can use the isEmpty()
method on a Query or ResultSet object to see if ithas any rows in it. Calling isEmpty()
on a Query object will evaluate thequery:
- // Check a query.
- $query->isEmpty();
- // Check results
- $results = $query->all();
- $results->isEmpty();
Loading Additional Associations
Once you’ve created a result set, you may need to loadadditional associations. This is the perfect time to lazily eager load data. Youcan load additional associations using loadInto()
:
- $articles = $this->Articles->find()->all();
- $withMore = $this->Articles->loadInto($articles, ['Comments', 'Users']);
You can eager load additional data into a single entity, or a collection ofentities.
Modifying Results with Map/Reduce
More often than not, find operations require post-processing the data that isfound in the database. While entities’ getter methods can take care of most ofthe virtual field generation or special data formatting, sometimes youneed to change the data structure in a more fundamental way.
For those cases, the Query
object offers the mapReduce()
method, whichis a way of processing results once they are fetched from the database.
A common example of changing the data structure is grouping results togetherbased on certain conditions. For this task we can use the mapReduce()
function. We need two callable functions the $mapper
and the $reducer
.The $mapper
callable receives the current result from the database as firstargument, the iteration key as second argument and finally it receives aninstance of the MapReduce
routine it is running:
- $mapper = function ($article, $key, $mapReduce) {
- $status = 'published';
- if ($article->isDraft() || $article->isInReview()) {
- $status = 'unpublished';
- }
- $mapReduce->emitIntermediate($article, $status);
- };
In the above example $mapper
is calculating the status of an article, eitherpublished or unpublished, then it calls emitIntermediate()
on theMapReduce
instance. This method stores the article in the list of articleslabelled as either published or unpublished.
The next step in the map-reduce process is to consolidate the final results. Foreach status created in the mapper, the $reducer
function will be called soyou can do any extra processing. This function will receive the list of articlesin a particular “bucket” as the first parameter, the name of the “bucket” itneeds to process as the second parameter, and again, as in the mapper()
function, the instance of the MapReduce
routine as the third parameter. Inour example, we did not have to do any extra processing, so we just emit()
the final results:
- $reducer = function ($articles, $status, $mapReduce) {
- $mapReduce->emit($articles, $status);
- };
Finally, we can put these two functions together to do the grouping:
- $articlesByStatus = $articles->find()
- ->where(['author_id' => 1])
- ->mapReduce($mapper, $reducer);
- foreach ($articlesByStatus as $status => $articles) {
- echo sprintf("There are %d %s articles", count($articles), $status);
- }
The above will ouput the following lines:
- There are 4 published articles
- There are 5 unpublished articles
Of course, this is a simplistic example that could actually be solved in anotherway without the help of a map-reduce process. Now, let’s take a look at anotherexample in which the reducer function will be needed to do something more thanjust emitting the results.
Calculating the most commonly mentioned words, where the articles containinformation about CakePHP, as usual we need a mapper function:
- $mapper = function ($article, $key, $mapReduce) {
- if (stripos($article['body'], 'cakephp') === false) {
- return;
- }
- $words = array_map('strtolower', explode(' ', $article['body']));
- foreach ($words as $word) {
- $mapReduce->emitIntermediate($article['id'], $word);
- }
- };
It first checks for whether the “cakephp” word is in the article’s body, andthen breaks the body into individual words. Each word will create its ownbucket
where each article id will be stored. Now let’s reduce our results toonly extract the count:
- $reducer = function ($occurrences, $word, $mapReduce) {
- $mapReduce->emit(count($occurrences), $word);
- }
Finally, we put everything together:
- $wordCount = $articles->find()
- ->where(['published' => true])
- ->andWhere(['published_date >=' => new DateTime('2014-01-01')])
- ->enableHydrate(false) // Prior to 3.4.0 use hydrate(false)
- ->mapReduce($mapper, $reducer)
- ->toArray();
This could return a very large array if we don’t clean stop words, but it couldlook something like this:
- [
- 'cakephp' => 100,
- 'awesome' => 39,
- 'impressive' => 57,
- 'outstanding' => 10,
- 'mind-blowing' => 83
- ]
One last example and you will be a map-reduce expert. Imagine you havea friends
table and you want to find “fake friends” in our database, orbetter said, people who do not follow each other. Let’s start with ourmapper()
function:
- $mapper = function ($rel, $key, $mr) {
- $mr->emitIntermediate($rel['target_user_id'], $rel['source_user_id']);
- $mr->emitIntermediate(-$rel['source_user_id'], $rel['target_user_id']);
- };
The intermediate array will be like the following:
- [
- 1 => [2, 3, 4, 5, -3, -5],
- 2 => [-1],
- 3 => [-1, 1, 6],
- 4 => [-1],
- 5 => [-1, 1],
- 6 => [-3],
- ...
- ]
Positive numbers mean that a user, indicated with the first-level key, isfollowing them, and negative numbers mean that the user is followed by them.
Now it’s time to reduce it. For each call to the reducer, it will receive a listof followers per user:
- $reducer = function ($friends, $user, $mr) {
- $fakeFriends = [];
- foreach ($friends as $friend) {
- if ($friend > 0 && !in_array(-$friend, $friends)) {
- $fakeFriends[] = $friend;
- }
- }
- if ($fakeFriends) {
- $mr->emit($fakeFriends, $user);
- }
- };
And we supply our functions to a query:
- $fakeFriends = $friends->find()
- ->enableHydrate(false) // Prior to 3.4.0 use hydrate(false)
- ->mapReduce($mapper, $reducer)
- ->toArray();
This would return an array similar to this:
- [
- 1 => [2, 4],
- 3 => [6]
- ...
- ]
The resulting array means, for example, that user with id 1
follows users2
and 4
, but those do not follow 1
back.
Stacking Multiple Operations
Using mapReduce
in a query will not execute it immediately. The operation willbe registered to be run as soon as the first result is attempted to be fetched.This allows you to keep chaining additional methods and filters to the queryeven after adding a map-reduce routine:
- $query = $articles->find()
- ->where(['published' => true])
- ->mapReduce($mapper, $reducer);
- // At a later point in your app:
- $query->where(['created >=' => new DateTime('1 day ago')]);
This is particularly useful for building custom finder methods as described in theCustom Finder Methods section:
- public function findPublished(Query $query, array $options)
- {
- return $query->where(['published' => true]);
- }
- public function findRecent(Query $query, array $options)
- {
- return $query->where(['created >=' => new DateTime('1 day ago')]);
- }
- public function findCommonWords(Query $query, array $options)
- {
- // Same as in the common words example in the previous section
- $mapper = ...;
- $reducer = ...;
- return $query->mapReduce($mapper, $reducer);
- }
- $commonWords = $articles
- ->find('commonWords')
- ->find('published')
- ->find('recent');
Moreover, it is also possible to stack more than one mapReduce
operation fora single query. For example, if we wanted to have the most commonly used wordsfor articles, but then filter it to only return words that were mentioned morethan 20 times across all articles:
- $mapper = function ($count, $word, $mr) {
- if ($count > 20) {
- $mr->emit($count, $word);
- }
- };
- $articles->find('commonWords')->mapReduce($mapper);
Removing All Stacked Map-reduce Operations
Under some circumstances you may want to modify a Query
object so that nomapReduce
operations are executed at all. This can be done bycalling the method with both parameters as null and the third parameter(overwrite) as true
:
- $query->mapReduce(null, null, true);