I’m still on a mission this year to switch to Replit as my full-time IDE.
Unfortunately, I’m noticing some limitations. For example, I can’t use Puppeteer with Node.js (which is a shame, because I’m taking a scraping course on Udemy and can’t follow along because of that limitation).
Scraping works fine with something like axios. But since Puppeteer relies on opening a browser window, it gets blocked.
Or is there a way around this? Am I missing something? I know some people here use Replit as their full-time IDE, so I wanted to ask them.
/home/runner/random/node_modules/puppeteer-core/lib/cjs/puppeteer/node/BrowserRunner.js:300
reject(new Error([
^
Error: Failed to launch the browser process!
/home/runner/.cache/puppeteer/chrome/linux-1095492/chrome-linux/chrome: error while loading shared libraries: libgobject-2.0.so.0: cannot open shared object file: No such file or directory
TROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md
at onClose (/home/runner/random/node_modules/puppeteer-core/lib/cjs/puppeteer/node/BrowserRunner.js:300:20)
at Interface.<anonymous> (/home/runner/random/node_modules/puppeteer-core/lib/cjs/puppeteer/node/BrowserRunner.js:288:24)
at Interface.emit (node:events:525:35)
at Interface.emit (node:domain:489:12)
at Interface.close (node:internal/readline/interface:536:10)
at Socket.onend (node:internal/readline/interface:262:10)
at Socket.emit (node:events:525:35)
at Socket.emit (node:domain:489:12)
at endReadableNT (node:internal/streams/readable:1359:12)
Node.js v18.12.1
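A note on reading the trace above: Chrome itself was found in the Puppeteer cache, but the dynamic loader could not find `libgobject-2.0.so.0` (part of GLib), so the failure is a missing system library rather than a blocked browser. Below is a minimal sketch for pulling the missing library name out of such a message. `missingSharedLibrary` is a hypothetical helper name, and the regular expression assumes the exact loader wording seen in this trace.

```javascript
// Hypothetical helper: extract the missing library name from a dynamic-loader
// error like the one in the trace above. Returns null if the message does not match.
function missingSharedLibrary(message) {
  const pattern =
    /error while loading shared libraries: (\S+?): cannot open shared object file/;
  const match = pattern.exec(message);
  return match ? match[1] : null;
}

// The error line from the trace, split across string literals for readability.
const sample =
  "/home/runner/.cache/puppeteer/chrome/linux-1095492/chrome-linux/chrome: " +
  "error while loading shared libraries: libgobject-2.0.so.0: " +
  "cannot open shared object file: No such file or directory";

console.log(missingSharedLibrary(sample)); // prints "libgobject-2.0.so.0"
```

Knowing the library name tells you which system package to look for. Fixing one library may simply reveal the next missing one, so expect to repeat this a few times.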
/home/runner/random/node_modules/puppeteer-core/lib/cjs/puppeteer/node/ProductLauncher.js:127
throw new Error(`Could not find Chromium (rev. ${this.puppeteer.browserRevision}). This can occur if either\n` +
^
Error: Could not find Chromium (rev. 1095492). This can occur if either
1. you did not perform an installation before running the script (e.g. `npm install`) or
2. your cache path is incorrectly configured (which is: /home/runner/.cache/puppeteer).
For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.
at ChromeLauncher.resolveExecutablePath (/home/runner/random/node_modules/puppeteer-core/lib/cjs/puppeteer/node/ProductLauncher.js:127:27)
at ChromeLauncher.executablePath (/home/runner/random/node_modules/puppeteer-core/lib/cjs/puppeteer/node/ChromeLauncher.js:206:25)
at ChromeLauncher.launch (/home/runner/random/node_modules/puppeteer-core/lib/cjs/puppeteer/node/ChromeLauncher.js:93:37)
Node.js v18.12.1
Any other solutions? The funny part is that when I click on that URL in the error, it takes me to a 404 page. So no luck.
Thanks for letting me know about those packages though. Now I’ll know to remove them each time.
import puppeteer from 'puppeteer-core';
import { exec } from "child_process";

function findChromium() {
  return new Promise((res, rej) => {
    exec("nix eval nixpkgs.chromium.outPath --raw", (error, stdout, stderr) => {
      if (error) rej(error.message);
      else if (stderr) rej(stderr);
      else res(`${stdout}/bin/chromium`);
    });
  });
}

const chromiumPath = await findChromium();
const browser = await puppeteer.launch({
  headless: false,
  executablePath: chromiumPath,
  args: ['--no-sandbox', '--disable-setuid-sandbox']
});
await browser.close();
Once I start this, it opens the browser and just stops there at about:blank. I added some console.log calls, and it doesn’t look like it’s getting past the puppeteer.launch call.
Notice that I’m using puppeteer-core instead of puppeteer, as the Chrome executable is downloaded with Nix.
WARNING: For simplicity, I disabled the sandbox. As the docs say, only do this if you absolutely trust the websites you are scraping!
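To make the sandbox trade-off explicit in code, one option is to build the launch options in a small function where disabling the sandbox is an opt-in flag rather than the default. This is only a sketch: `buildLaunchOptions` and its `disableSandbox` parameter are hypothetical names, and it defaults to headless mode, unlike the code above. The `headless`, `executablePath` and `args` keys are the same ones used in the launch call above.

```javascript
// Hypothetical helper: build the options object passed to puppeteer.launch().
// The two sandbox-disabling flags are only added when the caller opts in.
function buildLaunchOptions(executablePath, { disableSandbox = false, headless = true } = {}) {
  const args = disableSandbox
    ? ["--no-sandbox", "--disable-setuid-sandbox"]
    : [];
  return { headless, executablePath, args };
}

// Example: opt in to the flags for a site you trust.
// The path here is a placeholder, not a real location.
const options = buildLaunchOptions("/path/to/chromium", { disableSandbox: true });
console.log(options.args.join(" "));
```

You would then call `puppeteer.launch(options)`. Whether Chromium can start at all on Replit with the sandbox left enabled is something I have not verified.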
Regarding #1, I did install chromium using the Replit “Packages” tool (I only use that now instead of npm or yarn… that’s OK, right?)
Regarding #2, how do I reconfigure my cache path? I can’t find it.
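On the cache path question: as far as I know, Puppeteer's configuration guide describes a `cacheDirectory` option (set in a config file such as `.puppeteerrc.cjs`) and a `PUPPETEER_CACHE_DIR` environment variable that overrides it, with `~/.cache/puppeteer` as the default, which matches the path in the error above. The sketch below is a simplified illustration of that lookup order, not Puppeteer's actual code. `effectiveCacheDir` is a hypothetical name, and it ignores any config file.

```javascript
// Simplified illustration (not Puppeteer's real implementation): which cache
// directory would be in effect for a given set of environment variables.
// Assumes a Linux-style HOME variable; config files are not considered.
function effectiveCacheDir(env) {
  if (env.PUPPETEER_CACHE_DIR) {
    return env.PUPPETEER_CACHE_DIR;
  }
  return `${env.HOME}/.cache/puppeteer`;
}

// Show what applies in the current environment.
console.log(effectiveCacheDir(process.env));
```

If you do set `PUPPETEER_CACHE_DIR`, it presumably needs to be set both when the package is installed (so the browser is downloaded there) and when the script runs.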
Great! I might be wrong, but it looks like you solved it by replacing chromium with chromium-browser.
I have very limited knowledge of Nix, and I may very well be wrong, but I think the executable path changes every time your Repl boots and installs the Nix packages. I know there’s a way to get Nix to store this path in an environment variable, but I wasn’t able to find it… (That’s why I did the thing where it runs the nix command to find it.)
Other than that, thanks for sharing your solution!
EDIT: Huh, I tried your code out and it seems like it only works when I hardcode the executable path! I think it has something to do with how nix eval nixpkgs.chromium.outPath outputted the path to Chrome 92, while you have Chrome 108 there. It also seems like it’s perfectly fine to hardcode the path. Thanks again!
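If hardcoding works but you still want a way to change the path without editing the script, one compromise is to prefer an environment variable and fall back to the hardcoded value. This is just a sketch: `CHROMIUM_PATH` is a made-up variable name (nothing in Replit or Nix sets it for you), `chooseExecutablePath` is a hypothetical helper, and the fallback path below is a placeholder rather than a real store path.

```javascript
// Hypothetical helper: prefer a path from the environment, otherwise use the
// hardcoded one. CHROMIUM_PATH is an assumed variable name you would set yourself.
function chooseExecutablePath(env, hardcodedPath) {
  const fromEnv = (env.CHROMIUM_PATH || "").trim();
  return fromEnv !== "" ? fromEnv : hardcodedPath;
}

// Placeholder fallback; substitute the path that works in your Repl.
const chromiumExecutable = chooseExecutablePath(process.env, "/path/to/chromium");
console.log(chromiumExecutable);
```

The result can be passed as `executablePath` in the `puppeteer.launch` call.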