Categories
Tech

PyWebTest Project

A couple of weeks ago, I started a small project to help a friend buy air tickets. I naively imagined that all I needed to do was open the airline’s website, enter some query information, and keep refreshing until a ticket showed up. It didn’t start out well.

Tech stack

  • Python 3
  • Selenium
  • Chrome driver
  • Chrome browser

I’m new to the web crawling/scraping area, but I have done some testing with Selenium before. I thought I could just figure out where to click, enter flight information, search, and then keep refreshing.

The problems

Soon I realized that the airline website has a mechanism to block this kind of automation. After I finished coding the functionality above, I ran my code with the refresh interval set to 30 seconds. It went well for about 20 minutes, so I patted myself on the back, thought everything worked perfectly, and went off to do some housework.

An hour later, I came back and saw a Google reCAPTCHA screen. Okay, the airline website knew that I was using a bot to refresh for a new airline ticket. I added some random sleep time on top of my 30-second refresh interval. After another 10 minutes or so, my IP address was blocked. 😶
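
For illustration, the "random sleep on top of a 30-second interval" idea could look roughly like the sketch below. The 15-second jitter bound and the names `next_delay` and `refresh_loop` are my own assumptions for the example, not code taken from the project.

```python
import random
import time

BASE_INTERVAL_SECONDS = 30  # refresh interval mentioned in the post
MAX_JITTER_SECONDS = 15     # assumed upper bound for the random extra delay


def next_delay(base=BASE_INTERVAL_SECONDS, max_jitter=MAX_JITTER_SECONDS):
    """Return the base interval plus a random extra delay, in seconds."""
    return base + random.uniform(0, max_jitter)


def refresh_loop(refresh, rounds, sleep=time.sleep):
    """Call refresh() `rounds` times, pausing a jittered delay after each call.

    `sleep` is injectable so the loop can be exercised without really waiting.
    """
    for _ in range(rounds):
        refresh()
        sleep(next_delay())
```

With Selenium, `refresh` could be the driver's own `driver.refresh` method. As the story shows, jitter alone was not enough to avoid getting blocked.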

Solution

The first thing I could think of was using a VPN. So I bought a VPN service so that I could continue developing the bot. That temporarily solved the blocked IP address issue.

Next, I had to solve the Google reCAPTCHA problem, and there were two ways to go. One is to use some sort of AI library to break it. That route is rather difficult because Google reCAPTCHA is very sophisticated and hard to get past (otherwise, they’d be out of business already😂). That left me with the second option, faking the user agent header, which is my only alternative for now. A user agent header is a piece of information that your browser sends with each HTTP/HTTPS request. It describes which browser you’re using and which operating system you’re on. The basic principle is to randomly generate a new identity for my browser so that Google reCAPTCHA treats it as a different device. I found a library called fake-useragent that does exactly this for me, and it worked!
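
Here is a minimal sketch of how a random user agent could be handed to Chrome through Selenium. It is not the actual PyWebTest code: the helper names and the small fallback list of user agent strings are assumptions made for the example, and the Selenium and fake-useragent imports sit inside functions so the rest can be read and run without them installed.

```python
import random

# Illustrative fallback pool (an assumption for this sketch), used only
# when the fake-useragent package is not available.
FALLBACK_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


def pick_user_agent():
    """Return a random user agent string, preferring fake-useragent."""
    try:
        from fake_useragent import UserAgent
        return UserAgent().random
    except Exception:
        # Package missing or its data unavailable: use the local pool.
        return random.choice(FALLBACK_USER_AGENTS)


def chrome_user_agent_argument(user_agent):
    """Build the Chrome command-line switch that overrides the user agent."""
    return f"--user-agent={user_agent}"


def build_driver():
    """Start Chrome with a randomly chosen user agent (requires Selenium)."""
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(chrome_user_agent_argument(pick_user_agent()))
    return webdriver.Chrome(options=options)
```

Calling `build_driver()` again gives a fresh browser session with a different identity. Whether that is enough to avoid a CAPTCHA depends on the site; it worked in my case, but I wouldn’t treat it as a guarantee.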

After that, the rest is just code to enter flight information and search. Most of the code can be reused for UI testing and web crawling, so I decided to open-source part of it. That is how the PyWebTest project started.

https://github.com/lokarithm/PyWebTest

Cheers,
Lok

Categories
Tech

A Cheatsheet Of Linux File System And Structure

Although I’m not completely new to the Linux operating system, its file system is still confusing to me. I watched a video about the Linux file system structure, so it’s time to put down some notes and get familiar with it. Let’s go!

  • /bin – binaries:
    • For programs and applications.
  • /sbin – system binaries:
    • For root/admin users only.
  • /boot – contains everything a system needs to boot
    • e.g. boot loader
  • /dev – devices:
    • Everything is a file in Linux. Hardware such as disks, webcams, and keyboards is represented here as device files.
  • /etc – etcetera:
    • It stores all the system-wide configuration files.
  • /lib – libraries (includes lib, lib32 and lib64)
    • Libraries of applications
  • /media and /mnt – mount:
    • For other mounted drives. You’ll typically see /mnt instead of the /media directory. However, most distros nowadays automatically mount devices for you in the /media directory.
    • When you mount a drive manually, use the /mnt directory
    • e.g. external hard drive, USB flash drive, network drive
  • /opt – optional:
    • Manually installed software lives here. Some software packages found in the repo can also be found here.
    • You can also put your own software in this directory.
  • /proc – processes:
    • Pseudo files that contain information about system processes and resources.
    • e.g. A directory that contains information on a running process; information of the CPU, etc.
  • /root – home folder of a root user
    • A directory where only a root user has access.
  • /run:
    • A relatively new directory. It is a tmpfs file system, which means it lives in RAM. Anything in this directory will be deleted after rebooting the system. It is used by processes that start early in the boot procedure to store runtime information.
  • /snap:
    • It stores snap packages (mainly used on Ubuntu). Snap packages are self-contained applications that run differently from other applications.
  • /srv – server:
    • It stores server data, i.e. files that will be accessed by external users. For example, if you set up an FTP server, its files can go here, separated from the other files for security purposes.
  • /sys – system:
    • A way to interact with the kernel. This is also a temporary directory, created every time the system boots up, similar to the /run directory.
  • /tmp – temporary:
    • Files stored temporarily by applications. It’s usually empty after you reboot a system.
    • e.g. temporary files of a word processor application.
  • /usr – user application:
    • Applications installed by a user or for a user only. Applications installed in this directory are considered non-essential for basic system operation.
    • Under the /usr directory, there are folders such as lib, bin, and sbin.
    • /usr/local contains software installed from the source code.
    • /usr/share contains architecture-independent data shared by applications, such as documentation and icons.
    • /usr/src contains installed source code such as kernel header files.
    • Different software or distros may treat these folders differently.
  • /var – variable:
    • Files and directories that are expected to grow in size.
    • e.g. /var/crash contains information about processes that have crashed; /var/log contains log files of many applications.
  • /home:
    • Each user has their own folder under /home.
    • Storage of your personal files and documents.
    • Each user can only access their own home folder unless they use admin permissions.
    • It has some hidden directories that start with a dot. e.g. .cache, .config. These hidden folders are used by different applications for their settings. You can see them by using the ls -a command in the terminal.
    • You can back up the hidden directories and restore them in a new system. After reinstalling your applications, the settings will be restored.
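
As a small illustration of the hidden directories mentioned under /home, the sketch below lists the dot-prefixed entries of a directory from Python. The function name `hidden_entries` is just a name I picked for the example. Unlike `ls -a`, it returns only the hidden entries and leaves out everything else.

```python
from pathlib import Path


def hidden_entries(directory):
    """Return the sorted names of entries in `directory` that start with a dot."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.name.startswith(".")
    )
```

Calling `hidden_entries(Path.home())` should show entries such as .cache and .config if they exist on your system.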

Credit/Source of information: DorianDotSlash

Categories
Productivity

How To Study Less, Study Smart

I came across a video recently about how to study smarter. I found it useful for people of any age who need to focus and learn. Here are the notes I took to summarize it.

1. Break your study sessions into 25-minute chunks

Your attention is limited. Studies show that most people can only focus on a task for about 25 minutes. After that, your efficiency drops very quickly.
When you reach a plateau where you cannot focus anymore, take a 5-minute break. That will recharge your brain and give you a fresh start.
You can also reward yourself after finishing your entire day. This way you’ll enjoy studying more in the future.

2. Create a dedicated study area

You have trained yourself to behave differently in different rooms or areas. For example, you eat at the dining table, sleep in the bedroom, etc. If you eat, study, and play at the same desk, your brain will be confused when you need to focus. Ideally, you would have an area used only for studying. But sometimes this is a luxury, especially when you’re living in a dorm room. The solution is to buy a study lamp. It doesn’t have to be fancy or expensive. You only turn it on when you need to focus and study. That way you’ll learn to associate the lit lamp with staying concentrated.

3. Study actively – once you’ve learned, test yourself actively

When it comes to studying, there are two categories of things to memorize: concepts and facts. A concept is something easy to remember once you truly understand it. For example, the function of a particular bone in the human body. A fact is something you just have to memorize. For example, the name of a particular human bone.

Our brain is good at recognizing but not good at recollecting. For example, when you can’t remember a concept and you look at a highlighted paragraph, you might think you have remembered it. But you have actually just recognized it. If you want to truly memorize it, you have to test yourself and learn actively. To distinguish whether you are recalling something or recognizing it, try explaining it without looking at your notes or the book. If you can do that, congratulations! You’re recalling it.

4. After class, study as soon as possible

Our brain can forget things pretty quickly. If you study or summarize what you just learned right away, the memory will be strengthened much more than if you wait. The sooner you study, the more easily you can retain the knowledge in the future. A five-minute investment of your time after class will help you recollect the details much better, even the next day.

5. The SQ3R reading method

Survey: Skim through all the headings and sub-headings of each chapter before reading into the detail.
Question: Convert the headings into questions. Ask yourself what each paragraph is trying to answer.
Read: Read the paragraphs to find out what they are actually trying to tell you.
Recite: After reading, say out loud, in your own words, what you have just read, as if you’re teaching it to someone.
Review: Review right away and review frequently. That way you’ll find it very easy to study for your tests and exams.

6. Use mnemonics to remember facts

Use acronyms. For example, use ROYGBIV to represent the colors: red orange yellow green blue indigo violet.

Try to associate numbers with syllables or related words. For example, to remember how many calories a gram of carbohydrate contains, you can count the syllables of the word car-bo-hy-drates. Then you’d know it is 4. As another example, carbohydrate starts with the word “car”. A car has four wheels, so you would remember that carbohydrate has 4 calories per gram.

If something is hard to remember with the methods above, think of a ridiculous story to connect the dots. The weirder the story, the easier it is to remember.

Summary:

  1. Break your study sessions into 25-minute chunks
  2. Create a dedicated study area
  3. Study actively
  4. After class, study as soon as possible
  5. Use the SQ3R method to read
  6. Use mnemonics to memorize facts

Cheers,
Lok