crwlr.software

@crwlr

crwlr.software is a collection of open-source PHP Composer packages for web crawling and scraping.

36 Followers, 152 Following, 14 Posts, Joined 21.11.2024

Latest posts by crwlr.software @crwlr

😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈😈

15.10.2025 12:39 👍 0 🔁 1 💬 0 📌 0

Made it to 10,000 installs of the crwlr/crawler package! 🥳💪🏻
Here’s to the next 10,000! 🥂

31.07.2025 17:58 👍 0 🔁 1 💬 0 📌 0

Reached some install milestones for the crwlr packages 🎉
- url passed 50,000 installs
- query-string is at an unbelievable 3,000,000 installs (thanks to bref!)
- robots-txt is approaching 15,000
- crawler will soon reach 10,000

Thanks to everyone using and supporting the packages! 🫶

07.05.2025 08:09 👍 1 🔁 1 💬 0 📌 0
GitHub - crwlrsoft/crawler: Library for Rapid (Web) Crawler and Scraper Development

Made it to 350 ⭐️s on the crawler package GitHub repo! 🚀💪🏻🫶
github.com/crwlrsoft/cr...

28.01.2025 09:41 👍 0 🔁 0 💬 0 📌 0
Doesn't seem to work with this robots.txt · Issue #12 · crwlrsoft/robots-txt robots.txt ---------------------------- User-agent: * Disallow: /wp-admin/ Allow: /wp-admin/admin-ajax.php User-agent: * Disallow: /*blackhole Disallow: /?blackhole ---------------------------- url...

Tagged a bugfix release of the robots-txt package, thx to this issue github.com/crwlrsoft/ro... by Martin-Matchory. 🙏💪🏻

27.01.2025 18:09 👍 0 🔁 1 💬 0 📌 0
Crwlr Recipes: Using a Crawler for Website Error Detection and Cache Warming Have you ever deployed your website or web app, only to discover hours later that you’ve introduced bugs or broken links? Or do you clear the cache with every deploy, leaving the first users to experi...

A new blog article is out! 🤓📚

Learn how to use a crawler to perform a health check and warm your cache after every deployment of your website or web app. Ensure smooth performance and error-free user experiences. Happy reading!

www.crwlr.software/blog/post-de...

20.01.2025 08:53 👍 0 🔁 0 💬 0 📌 1
symfony/src/Symfony/Component/DomCrawler/NativeCrawler/DomCrawler.php at 10667d5382b4dc2e4078d6408f92f34ffd52b580 · symfony/symfony The Symfony PHP framework. Contribute to symfony/symfony development by creating an account on GitHub.

Wait, I think I was wrong and there is still an XPath option! See github.com/symfony/symf...
from github.com/symfony/symf...

09.12.2024 13:51 👍 2 🔁 0 💬 0 📌 0

Ah, great insight 👍 thx!

06.12.2024 20:46 👍 0 🔁 0 💬 0 📌 0

Yeah, maybe also because it'd be a lot more work implementing both and CSS selectors are easier and more popular.
I know that Symfony translates CSS selectors to XPath under the hood, because CSS selectors weren't available natively. Using CSS selectors instead is definitely the better choice!

06.12.2024 09:44 👍 1 🔁 0 💬 0 📌 0
PHP: rfc:dom_additions_84

CSS Selectors 🙂
wiki.php.net/rfc/dom_addi...

06.12.2024 09:37 👍 1 🔁 0 💬 0 📌 0

Ah, it seems like XPath is no longer an option in the new DOM API in PHP 8.4. I assume it won’t be a problem if we remove this feature in a new major version of the crwlr/crawler library?

05.12.2024 12:42 👍 1 🔁 2 💬 1 📌 0

The new HTML5/DOM functionality in PHP 8.4 is amazing! I’m working on adding it to crwlr/crawler to leverage it when PHP 8.4 is used. This will enable features like CSS pseudo-selectors not supported by Symfony DomCrawler.

03.12.2024 11:47 👍 2 🔁 0 💬 0 📌 1
PHP 8.4 Released PHP 8.4 is a major update of the PHP language. It contains many new features, such as property hooks, asymmetric visibility, an updated DOM API, performance improvements, bug fixes, and general cleanu...

PHP 8.4 just dropped! 🥳🚀💪
php.net/releases/8.4...
Great release page! Well done!

Already added running tests on 8.4 in CI, so all crwlr packages should be ready for the upgrade! 💪🙂

21.11.2024 10:45 👍 1 🔁 0 💬 0 📌 0

Hey Bluesky! 👋
What better time to move over from the old place? So, crwlr is now here on Bluesky too! 🥳🚀

21.11.2024 10:45 👍 0 🔁 0 💬 0 📌 0