Web Scraping a Premium Sneaker Boutique’s Shopify Store

J Yue
3 min read · May 1, 2021

I have been curious about web scraping for a while now and decided to move forward with a tutorial from John Rooney.

In John’s tutorial he uses Helm Footwear. For my own purposes I decided to use renowned premium sneaker boutique Extra Butter NY.

The Method

Information was scraped on April 15, 2021.

Products Used For This Blog Post

  • Google Colaboratory
  • Google Sheets

# Exporting Extra Butter products
import requests
import json
import pandas as pd

url = 'https://extrabutterny.com/products.json'
r = requests.get(url)
data = r.json()

product_list = []

for item in data['products']:
    title = item['title']
    handle = item['handle']
    created = item['created_at']
    product_type = item['product_type']
    vendor = item['vendor']
    # print(title, handle, created, product_type)

    # Default to 'None' so a product without images still gets exported
    imagesrc = 'None'
    for image in item['images']:
        try:
            imagesrc = image['src']
        except KeyError:
            imagesrc = 'None'

    # One row per variant (size, colour, etc.) of the product
    for variant in item['variants']:
        price = variant['price']
        sku = variant['sku']
        available = variant['available']
        product = {
            'vendor': vendor,
            'title': title,
            'handle': handle,
            'created': created,
            'product_type': product_type,
            'price': price,
            'sku': sku,
            'available': available,
            'image': imagesrc
        }
        product_list.append(product)

# print(product_list)
df = pd.DataFrame(product_list)
# please note: this is not the full path - name yours however you see fit
df.to_csv('/content/drive/MyDrive/Colab Notebooks/...')
print('saved to file.')
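One way to make the loop above easier to check is to pull the flattening step into a function and run it on a small hand-made payload before pointing it at the live store. This sketch is not from the tutorial: `flatten_products` and the sample data are hypothetical names and values made up for illustration, and it assumes the JSON uses the same `products`, `images` and `variants` keys as the script.

```python
def flatten_products(data):
    """Return one row (dict) per variant from a products.json-style payload."""
    rows = []
    for item in data['products']:
        # Keep the last image's src if there is one, as the script's loop does;
        # fall back to 'None' when the product has no images.
        imagesrc = 'None'
        for image in item.get('images', []):
            imagesrc = image.get('src', 'None')
        for variant in item['variants']:
            rows.append({
                'vendor': item['vendor'],
                'title': item['title'],
                'handle': item['handle'],
                'created': item['created_at'],
                'product_type': item['product_type'],
                'price': variant['price'],
                'sku': variant['sku'],
                'available': variant['available'],
                'image': imagesrc,
            })
    return rows


# Made-up sample payload (not real Extra Butter data)
sample = {
    'products': [
        {
            'title': 'Sample Runner', 'handle': 'sample-runner',
            'created_at': '2021-04-01', 'product_type': 'Footwear',
            'vendor': 'Brand A',
            'images': [{'src': 'https://example.com/runner.jpg'}],
            'variants': [
                {'price': '100.00', 'sku': 'A-8', 'available': True},
                {'price': '100.00', 'sku': 'A-9', 'available': False},
            ],
        },
        {
            'title': 'Sample Cap', 'handle': 'sample-cap',
            'created_at': '2021-04-02', 'product_type': 'Headwear',
            'vendor': 'Brand B',
            'images': [],
            'variants': [{'price': '30.00', 'sku': 'B-1', 'available': True}],
        },
    ]
}

rows = flatten_products(sample)
print(len(rows))  # one row per variant
```

Because the function takes plain dictionaries, it can be tested without making a request to the store.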

Differences in Code

I added a line to include types of products in my CSV. Whereas Helm sells shoes, Extra Butter NY sells a variety of products which I was interested in reviewing.

'product_type': product_type,
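
With `product_type` in each row, a quick tally shows what kinds of products a store carries. This is a minimal sketch, assuming `product_list` has the shape built in the script; the rows below are invented for illustration.

```python
from collections import Counter

# Invented rows in the same shape as the scraper's product_list
product_list = [
    {'title': 'Sample Runner', 'product_type': 'Footwear'},
    {'title': 'Sample Runner', 'product_type': 'Footwear'},
    {'title': 'Sample Cap', 'product_type': 'Headwear'},
]

# Count how many rows fall under each product type
type_counts = Counter(row['product_type'] for row in product_list)
print(type_counts.most_common())
```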

What I Learned About the Code

Indents Are Important

When I first tried this tutorial out, I ran into a few false starts because of the way I had placed my indents. It’s important to pay attention to indentation because it affects how the code gets processed. Please note that the indents may not display in the code block above, depending on how this page renders it; they are reflected in my notebook. Either way, I encourage everyone to watch the tutorial for reference.
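
To show why this matters, here is a small made-up example (not taken from the tutorial) where the only difference between the two loops is whether the `append` line is indented under the `for`.

```python
items = ['a', 'b', 'c']

# append is indented under the for loop, so it runs once per item
inside = []
for item in items:
    row = item.upper()
    inside.append(row)

# append is NOT indented under the loop, so it runs only once, after the loop ends
outside = []
for item in items:
    row = item.upper()
outside.append(row)

print(inside)   # ['A', 'B', 'C']
print(outside)  # ['C']
```

In a scraper, the second version would quietly save only the last product instead of all of them.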

What I Learned About Extra Butter

The main goals of this exercise were to:

  • Begin to understand web scraping
  • Review and report on items that were no longer in stock based on the data pulled

Observations

  • Extra Butter carries 36 brands including their own in-house brand
  • The total number of items scraped on April 15 was 1777
  • They currently have a test product up titled “karin”

Learnings

  • 683 of the 1777 listings were sold out
  • 546 of these sold-out listings were Footwear items
  • 64% (442) of all sold-out products came from 10 brands
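
Counts like these can be pulled from the export with a few pandas filters. This is a sketch under assumptions: it uses a tiny invented DataFrame with the same column names as the export, and it assumes `available` holds booleans and that footwear rows carry the `product_type` value `'Footwear'`. I worked with the CSV in Google Sheets, so treat this as one possible approach rather than the method I used.

```python
import pandas as pd

# Invented rows with the same columns as the exported CSV
df = pd.DataFrame([
    {'vendor': 'Brand A', 'title': 'Shoe 1', 'product_type': 'Footwear', 'available': False},
    {'vendor': 'Brand A', 'title': 'Shoe 1', 'product_type': 'Footwear', 'available': True},
    {'vendor': 'Brand B', 'title': 'Shoe 2', 'product_type': 'Footwear', 'available': False},
    {'vendor': 'Brand C', 'title': 'Cap 1', 'product_type': 'Headwear', 'available': False},
])

total_listings = len(df)                 # every scraped row
brand_count = df['vendor'].nunique()     # distinct brands
sold_out = df[~df['available']]          # rows where available is False
sold_out_footwear = sold_out[sold_out['product_type'] == 'Footwear']

print(total_listings, brand_count, len(sold_out), len(sold_out_footwear))
```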

The best-selling brands include:

  • Air Jordan (136)
  • Adidas (120)
  • Nike (81)
  • Asics (50)
  • Converse (36)
  • Reebok (34)
  • New Era (33)
  • Nike SB (23)
  • The North Face (20)
  • Needles (15)
  • Karhu (14)

The number in parentheses is the number of units sold out per brand on the day this was pulled.
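
A per-brand ranking like this can be sketched with `value_counts` on the sold-out rows. The data below is invented, and `top_n` is a hypothetical variable set to 2 only because the sample is tiny.

```python
import pandas as pd

# Invented sold-out rows; only the vendor column matters here
sold_out = pd.DataFrame({
    'vendor': ['Brand A', 'Brand A', 'Brand A', 'Brand B', 'Brand B', 'Brand C'],
})

# value_counts sorts brands from most to fewest sold-out rows
per_brand = sold_out['vendor'].value_counts()

top_n = 2  # hypothetical cutoff for this small sample
top_brands = per_brand.head(top_n)

# Share of all sold-out rows that belong to the top brands
share = top_brands.sum() / len(sold_out)

print(top_brands.to_dict())
print(f'{share:.0%}')
```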

Insights

  • The best-selling item was the Asics Womens Gel-Lyte III OG Shoes (32 units sold out). This accounted for 5% of all sold-out items, but 64% of the sold-out items for this brand. Although Asics ranked fourth among sold-out brands, this particular item was evidently in high demand.
  • The best-selling Air Jordan item was the Air Jordan x CLOT Mens 14 Low SP Shoes (18 units). This was a special release in conjunction with the Lunar New Year, so it had likely been sold out for a while. It was also a limited run, so the 18 units were probably the entire inventory for this style.
  • 5 of the top 10 sold-out items were Adidas. This was due to their line of celebrity and pop culture collaborations. These sold-out items range from the highly coveted Yeezys to the South Park, Bad Bunny and Pharrell Williams collaborations.
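
The two percentages in the first insight (an item's share of everything sold out, and its share within its own brand) can be worked out by grouping the sold-out rows by brand and title. This is a sketch with invented data, assuming one row per sold-out variant.

```python
import pandas as pd

# Invented sold-out rows, one per variant
sold_out = pd.DataFrame([
    {'vendor': 'Brand A', 'title': 'Shoe 1'},
    {'vendor': 'Brand A', 'title': 'Shoe 1'},
    {'vendor': 'Brand A', 'title': 'Shoe 2'},
    {'vendor': 'Brand B', 'title': 'Shoe 3'},
])

# Count sold-out rows per (brand, item) and put the largest first
per_item = (
    sold_out.groupby(['vendor', 'title'])
    .size()
    .sort_values(ascending=False)
)

top_vendor, top_title = per_item.index[0]
top_count = per_item.iloc[0]

# Share of all sold-out rows, and share within the item's own brand
share_of_all = top_count / len(sold_out)
share_of_brand = top_count / (sold_out['vendor'] == top_vendor).sum()

print(top_vendor, top_title, top_count)
print(f'{share_of_all:.0%} of all, {share_of_brand:.0%} of brand')
```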

I’ll be revisiting Extra Butter NY’s numbers on April 22 to see what changed.
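
One possible way to compare two scrapes is to merge them on `sku` and look at how `available` changed. This is only a sketch: the two DataFrames are invented, `week1` and `week2` are hypothetical names, and it assumes each `sku` appears once per scrape, which may not hold for real data.

```python
import pandas as pd

# Invented snapshots from two different scrape dates
week1 = pd.DataFrame({'sku': ['A1', 'B2', 'C3'], 'available': [True, False, True]})
week2 = pd.DataFrame({'sku': ['A1', 'B2', 'D4'], 'available': [False, False, True]})

# Outer merge keeps SKUs that appear in either snapshot;
# indicator=True adds a _merge column saying where each row came from
merged = week1.merge(
    week2, on='sku', how='outer', suffixes=('_old', '_new'), indicator=True
)

# In stock in the first scrape, sold out in the second
newly_sold_out = merged[
    merged['available_old'].eq(True) & merged['available_new'].eq(False)
]

# SKUs only in the second scrape, and SKUs that disappeared
new_listings = merged.loc[merged['_merge'] == 'right_only', 'sku']
removed = merged.loc[merged['_merge'] == 'left_only', 'sku']

print(newly_sold_out['sku'].tolist())
print(new_listings.tolist(), removed.tolist())
```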

J Yue

Digital strategist turned aspiring data analyst. Running enthusiast. I find coding tutorials, do them with my own twist and write about them.