Skip to content
This repository has been archived by the owner on Jan 30, 2020. It is now read-only.

Support LDAP paging control #41

Open
bishopb opened this issue Apr 13, 2016 · 14 comments
Open

Support LDAP paging control #41

bishopb opened this issue Apr 13, 2016 · 14 comments

Comments

@bishopb
Copy link

bishopb commented Apr 13, 2016

Originally requested at zendframework/zendframework#2780. Now that ZF requires PHP 5.5, we can rely on ldap_control_paged_result to send the paging control option and consume all entries. It's simply a matter of implementing it within the API.

A straightforward way that returns a complete array after pulling all pages follows. This solution improves on a similar one posted on SO, by reusing the base methods as much as possible. However, I feel that:

  • This is hackish. It jumps through hoops to access the result ID and it conflates the idea of "desired maximum number of entries" (sizelimit) with "number of entries to request per page".
  • A final solution should include an Iterator for pages, with an InnerIterator for entries.

Unfortunately, I've not had luck with a design that addresses these concerns. Ideas welcome.

    public function searchAllEntries(
        $filter, $basedn = null, $scope = self::SEARCH_SCOPE_SUB,
        array $attributes = [], $sort = null, $collectionClass = null,
        $sizelimit = 0, $timelimit = 0
    )
    {
        // calculate page size, then remove it from the filter as we need
        // ldap_search to grab as many as it can, per page
        $pagesz = ($filter['sizelimit'] ?: ($sizelimit ?: 500));
        unset($filter['sizelimit']);

        // run the loop, grabbing pages
        $ldap   = $this->getResource();
        $result = [];
        $cookie = '';
        ErrorHandler::start(E_WARNING);
        ldap_set_option($ldap, LDAP_OPT_PROTOCOL_VERSION, 3);
        do {
            ldap_control_paged_result($ldap, $pagesz, true, $cookie);

            // run the search and add in the results
            $collection = $this->search(
                $filter, $basedn, $scope, $attributes, $sort, $collectionClass,
                null, // no size limit: we need maximum per page
                $timelimit
            );
            $result = array_merge($result, $collection->toArray());

            // get that pesky search result resource handle, which is a
            // private member of the collection's inner iterator
            $inner = $collection->getInnerIterator();
            $steal = function (DefaultIterator $it) { return $it->resultId; };
            $steal = \Closure::bind($steal, null, $inner);
            $entry = $steal($inner);

            // get the next page cookie
            ldap_control_paged_result_response($ldap, $entry, $cookie);
        } while (! empty($cookie));

        // ensure that subsequent reads don't attempt to use paging
        ldap_control_paged_result($ldap, 0);
        ErrorHandler::stop();

        return $result;
    }
@heiglandreas
Copy link
Member

I find this approach rather difficult as it queries all search results from an LDAP server. As that might easily be multiple thousands of entries that have to be handled in memory I'd rather not encourage use of this scenario! In my eyes getting that many results shows that you should refine your search criteria.

I can think of something that might use actual paged results by subsequently returning one page after another for each request. So for the first call to a function you'd get the first page and for the next call you'd get the next page etc.

Is that what you had in mind?

On the other side currently the paged results are a bit limited due to the missing server-sided sorting (which is currently under development but I'm not sure it'l be in PHP7.1)

@bishopb
Copy link
Author

bishopb commented Apr 13, 2016

Is that what you had in mind?

Yes, along the lines of searchPages() would return a ContainerIterator, which on each iteration gives a Container. However I'm seeking feedback on the best API for such a change.

I'd rather not encourage use of this scenario!

Nor I, but such an API would be symmetric with the extant Ldap::searchEntries() method.

@heiglandreas
Copy link
Member

So the PageIterator would iterate over the pages while the ContentIterator(or however it is called) would then Iterate over the single entries. That would mean to encapsulate the current Zend\Ldap\Collection\DefaultIterator into a further Iterator.

Sounds like it's doable. The interesting part will be to keep the current page over HTTP-Requests. I'll have to think about that....

@ThaDafinser
Copy link

ThaDafinser commented Feb 23, 2017

I'm having also a use case where i need to iterate over all results - plus i need to grab also full "ranged" attributes.

For this i created those 2 test files for now - they do work..but yeah the code is a bit...
https://gist.github.com/ThaDafinser/6d8ea903357fb31bf23eac86f82b209c

@ThaDafinser
Copy link

ThaDafinser commented Feb 23, 2017

@bishopb @heiglandreas i ported now my code above to an iterator, which should already work quiet well.

If you like it or have feedback, please comment

<?php
namespace EsSyncAd;

use Iterator;
use Zend\Ldap\Ldap;
use Zend\Ldap\Exception;
use Zend\Ldap\ErrorHandler;

final class PagingIterator implements Iterator
{

    private $ldap;

    private $filter;

    private $baseDn;

    private $returnAttributes;

    private $pageSize;

    private $entries;

    private $current;

    /**
     * Required for paging
     *
     * @var unknown
     */
    private $currentResult;

    /**
     * Required for paging
     *
     * @var unknown
     */
    private $cookie = true;

    public function __construct(Ldap $ldap, string $filter, string $baseDn = null, array $returnAttributes = null, $pageSize = 250)
    {
        $this->ldap = $ldap;
        $this->filter = $filter;
        $this->baseDn = $baseDn;
        $this->returnAttributes = $returnAttributes;
        $this->pageSize = $pageSize;
    }

    private function getLdap()
    {
        return $this->ldap;
    }

    private function getFilter()
    {
        return $this->filter;
    }

    private function getBaseDn()
    {
        return $this->baseDn;
    }

    private function getReturnAttributes()
    {
        return $this->returnAttributes;
    }

    private function getPageSize()
    {
        return $this->pageSize;
    }

    private function fetchPagedResult()
    {
        if ($this->cookie === null || $this->cookie === '') {
            return false;
        }
        
        if ($this->cookie === true) {
            // First fetch!
            $this->cookie = '';
        }
        
        $ldap = $this->getLdap();
        $resource = $ldap->getResource();
        
        ldap_control_paged_result($resource, $this->getPageSize(), true, $this->cookie);
        if ($this->getReturnAttributes() !== null) {
            $resultResource = ldap_search($resource, $ldap->getBaseDn(), $this->getFilter(), $this->getReturnAttributes());
        } else {
            $resultResource = ldap_search($resource, $ldap->getBaseDn(), $this->getFilter());
        }
        if (! is_resource($resultResource)) {
            /*
             * @TODO better exception msg
             */
            throw new \Exception('ldap_search returned something wrong...' . ldap_error($resource));
        }
        
        $entries = ldap_get_entries($resource, $resultResource);
        
        ErrorHandler::start();
        $response = ldap_control_paged_result_response($resource, $resultResource, $this->cookie);
        ErrorHandler::stop();
        
        if ($response !== true) {
            $errNo = ldap_errno($resource);
            
            /*
             * @TODO CHECK HERE AGAIN!!!
             */
            var_dump($errNo);
            exit();
            
            if ($errNo === 10) {
                return false;
            }
            
            /*
             * @TODO better exception msg
             */
            throw new \Exception('Paged result was empty' . ldap_error($resource));
        }
        
        unset($entries['count']);
        
        if ($this->entries === null) {
            $this->entries = [];
        }
        
        $this->entries = array_merge($this->entries, $entries);
        
        return true;
    }

    public function current()
    {
        if (! is_array($this->current)) {
            $this->rewind();
        }
        if (! is_array($this->current)) {
            return;
        }
        
        return $this->current;
    }

    public function key()
    {
        if (! is_array($this->current)) {
            $this->rewind();
        }
        if (! is_array($this->current)) {
            return;
        }
        
        return $this->current['dn'];
    }

    public function next()
    {
        // initial
        if ($this->entries === null) {
            $this->fetchPagedResult();
        }
        
        next($this->entries);
        
        $this->current = current($this->entries);
    }

    public function rewind()
    {
        // initial
        if ($this->entries === null) {
            $this->fetchPagedResult();
        }
        
        reset($this->entries);
        $this->current = current($this->entries);
    }

    public function valid()
    {
        if (is_array($this->current)) {
            return true;
        }
        
        return $this->fetchPagedResult();
    }
}

@heiglandreas
Copy link
Member

Awesome! I'll have a detailed look later today!

@ThaDafinser
Copy link

ThaDafinser commented Feb 23, 2017

A new version is available here (combined with range attribute resolving)
Also the result is a bit converted (count removed and so on)
https://gist.github.com/ThaDafinser/1d081bed8e5e6505e97bedf5863a187c

@phisch
Copy link

phisch commented Jun 22, 2017

@heiglandreas Have you had the time to look at @ThaDafinsers iterator implementation?

@heiglandreas
Copy link
Member

The implementation looks sensible… I haven't tested it and I probably would have build it differently but hey, that's PHP ;)

Note that we'd have to iterate over the complete iterator once to be able to sort the items in-memory afterwards if sorting should be of interest.

For fetching all the possible items from an LDAP it's great!

@ThaDafinser: Do you want to open a PR?

@lorenzomar
Copy link

Hi all, any news about implementing this feature?

What to change the behavior of DefaultIterator to allow access to resultId?

@madmatt
Copy link

madmatt commented May 30, 2018

I am also keen to see this work completed, is there any plan for this? If not, I may be doing this myself - if so I will try and tidy it up into a PR.

Thanks!

@equinoxmatt
Copy link

Any update on getting the iterator (@madmatt ) merged in? This is a great solution.

@madmatt
Copy link

madmatt commented Jul 30, 2019

I didn't end up having time to get it raised as a PR against zend-ldap, I ended up adding it to our open-source module directly (silverstripe/ldap), with some adjustments.

For reference, in case it's useful for others trying to work out how to use the iterator:

Huge credit to @ThaDafinser for creating the iterator above, it's fantastic!

@weierophinney
Copy link
Member

This repository has been closed and moved to laminas/laminas-ldap; a new issue has been opened at laminas/laminas-ldap#5.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants