Extracting and Submitting CAPTCHA with Calculated Value (Not Working)

I’m new to coding and trying to build a script that automatically extracts and submits CAPTCHA values on a website (I’ve hidden the actual URLs for privacy reasons, but the structure is similar to http://www.demo.com/result.php and http://demo.com/api.php). I’m using Node.js with Axios and Cheerio libraries.

Here’s what I’ve achieved so far:

  • I can successfully fetch the HTML content of the CAPTCHA page.
  • My script extracts the arithmetic expression presented as text within a specific HTML element (e.g., a table cell containing a “+” symbol).
  • It parses the expression to extract the numbers and calculates the expected CAPTCHA value.
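
The parsing step described above can be sketched as a small standalone function. This is only an illustration, assuming the expression is always two non-negative integers separated by a single operator; the name evaluateExpression is hypothetical and is not part of the script below.

```javascript
// Hypothetical helper: evaluates a simple "a <op> b" arithmetic string.
// Returns null when the text is not a two-operand integer expression.
function evaluateExpression(text) {
  const match = /^\s*(\d+)\s*([+\-*\/])\s*(\d+)\s*$/.exec(text);
  if (!match) return null;

  const left = Number(match[1]);
  const right = Number(match[3]);

  switch (match[2]) {
    case '+': return left + right;
    case '-': return left - right;
    case '*': return left * right;
    case '/': return right === 0 ? null : left / right; // avoid dividing by zero
    default: return null;
  }
}

console.log(evaluateExpression('7 + 8')); // 15
console.log(evaluateExpression('9 - 4')); // 5
console.log(evaluateExpression('hello')); // null
```

Returning null for anything unexpected keeps a failed parse distinguishable from a legitimate result of 0.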

The problem:

The issue arises when I submit the calculated value (value_captcha) along with the other form data to the submission URL. The submission doesn’t succeed consistently, which suggests that the CAPTCHA the server expects differs from the one my script extracted. I have three possible explanations:

  1. Changing CAPTCHA Value: I retrieve the CAPTCHA value using my script, but when I submit it, it doesn’t work, implying the actual CAPTCHA might have changed between retrieval and submission.
  2. Dynamic CAPTCHA: My assumption is that the CAPTCHA is dynamic and generates a new value each time the page is loaded.
  3. Time Discrepancy: There might be a slight delay between fetching the CAPTCHA and submitting the form, allowing the CAPTCHA to refresh.

I want to achieve a solution where my script can:

  • Retrieve the CAPTCHA value.
  • Automatically submit the form with the retrieved CAPTCHA value in a single, streamlined process.

Code Snippet:

JavaScript

const axios = require('axios');
const cheerio = require('cheerio');

// Function to extract and evaluate the captcha expression
function extractAndEvaluateCaptcha(html) {
  const $ = cheerio.load(html);

  // Search for the specific HTML element containing the captcha expression
  const captchaElement = $('td:contains("+")').filter(function() {
    return $(this).text().match(/^\s*\d+\s*[\+\-\*\/]\s*\d+\s*$/);
  }).first();

  if (captchaElement.length > 0) {
    const arithmeticExpression = captchaElement.text().trim();
    console.log("Captcha arithmetic expression:", arithmeticExpression);

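    // Only "+" expressions reach this point (the selector requires a "+"),
    // so summing the two operands is sufficient here.
    const numbers = arithmeticExpression.match(/\d+/g).map(Number);
    const captchaValue = numbers.reduce((a, b) => a + b, 0);
    console.log("Calculated captcha value:", captchaValue);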

    return captchaValue; // Return the calculated value
  } else {
    console.log("Captcha element not found in the HTML response.");
    return null; // Indicate failure to find the element
  }
}

// Example usage (**This section needs modification!**)
const url = 'http://www.demo.com/result.php'; // Replace with actual URL

// **HERE's where we need to combine fetching and submitting:**
axios.get(url)
  .then(response => {
    if (response && response.data) {
      const extractedValue = extractAndEvaluateCaptcha(response.data);
      if (extractedValue !== null) { // compare with null so a result of 0 is not treated as a failure
        // **Submit form data with extracted value (implementation details needed)**
        const formData = {
          user_type: 2,
          name: 'robert',
          year: 2022,
          value_captcha: extractedValue,
          button2: 'Submit'
        };

        // **Make a POST request with formData (consider using Axios.post)**
        // ... submit form data with extractedValue

      } else {
        console.error("Failed to extract CAPTCHA value.");
      }
    } else {
      console.error("No response data received.");
    }
  })
  .catch(error => {
    console.error("Failed to fetch HTML:", error);
  });

cURL Example:

Bash

curl 'http://demo.com/api.php' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8' \
  -H 'Accept-Language: en-US,en;q=0.5' \
  -H 'Cache-Control: max-age=0' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -H 'Cookie: hidden' \
  -H 'Origin: http://demo.com/' \
  -H 'Referer: http://demo.com/' \
  -H 'Sec-GPC: 1' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36' \
  --data-raw 'user_type=2&name=robert&year=2022&value_captcha=15&button2=Submit' \
  --insecure
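
For comparison with the --data-raw body above, here is a minimal sketch of how the formData object from the script could be turned into the same application/x-www-form-urlencoded string using Node's built-in URLSearchParams. The value 15 is taken from the cURL example. As far as I know, Axios serializes a plain object as JSON by default, so the encoding is a likely point of mismatch, but that is an assumption about this site rather than a confirmed cause.

```javascript
// Same fields as the formData object in the script; 15 is the sample value from the cURL example.
const formData = {
  user_type: 2,
  name: 'robert',
  year: 2022,
  value_captcha: 15,
  button2: 'Submit'
};

// URLSearchParams expects string values, so convert each one explicitly.
const params = new URLSearchParams(
  Object.entries(formData).map(([key, value]) => [key, String(value)])
);

const body = params.toString();
console.log(body);
// user_type=2&name=robert&year=2022&value_captcha=15&button2=Submit
```

The resulting string matches the cURL payload field for field, which makes it easy to check the script's request against the one captured from the browser.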

Question:

How can I modify my script to achieve automatic retrieval and submission of the CAPTCHA value in a single process? Are there any ethical considerations or limitations to keep in mind when dealing with dynamic CAPTCHAs?

I look forward to any insights and suggestions from the Replit community!

This feels like circumventing the entire purpose of CAPTCHAs, since they’re intended to stop bot users, which is why they change.

Yes, I know that, but this isn’t an image CAPTCHA; it’s just two numbers to add, like 2+2. I still couldn’t find a way to solve the problem, as I’m a new learner. Can you please help me? :frowning:

I’m afraid I don’t do a lot in the field of web bots, so I don’t know what you would do differently here.

Also, personally, regardless of what kind of CAPTCHA it is, I would still not recommend automating solving them.

I could send you the website address if it helps. It’s basically a site for searching stuff, and I really need some help with it. Would you be able to take a look? I’d be super grateful!

It’s plain HTML text. I have a method for getting the text and calculating the result, but when I post it through the API, I don’t get the expected result.

As I said, I’m not experienced with web bots, I wouldn’t know where to start.

Please don’t try to circumvent restrictions a website has.