A new way to search and navigate TGSC

jameshillier

Basenotes Plus
Jul 15, 2020
316
72
TGSC is an invaluable resource, but has always been slow to search and is hard to navigate.
Recently with the search function going missing, I thought I'd share a little helper script I put together a while ago and have been polishing up today.



phenylethylalcohol.gif



How it works
The script and required data are all embedded in a single HTML file.
The Good Scents Company pages are loaded in an iframe below the search input.
When you search, the script gives you a list of results from the locally stored index.
When you select a result, the script navigates the iframe (the embedded webpage) to the selected TGSC page.

How to use it
  1. Download the attached zip file and unzip the contained .html file somewhere handy.
  2. Open the HTML file in your browser: double-click it, or right-click it and choose Open With > Edge/Firefox/Chrome/Safari.
  3. Type a search. Words and numbers work in any order, and you can also search by CAS or EINECS number.
  4. Search results are ordered by length, which is an attempt to surface the names closest to your search.
  5. Use up and down arrows on your keyboard to select a search result and hit enter (or just click a search result) to navigate to the desired page.
  6. Press down arrow to show the same results again.
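
For anyone curious how a search like this can work, here is a minimal sketch of any-order matching with length-based ordering. It is not the code from the attached file: the `search` function, the `sampleIndex` entries, their field names and the placeholder `ref` values are all assumptions made for illustration. The default cap of 50 mirrors the limit mentioned under Limitations.

```javascript
// Illustrative sample entries only. The real index format in the HTML file
// is not shown in this thread, and the ref values here are placeholders.
const sampleIndex = [
  { name: "phenethyl alcohol", cas: "60-12-8", ref: "placeholder-1" },
  { name: "phenethyl acetate", cas: "103-45-7", ref: "placeholder-2" },
  { name: "linalool", cas: "78-70-6", ref: "placeholder-3" },
  { name: "linalyl acetate", cas: "115-95-7", ref: "placeholder-4" },
];

// Returns entries whose name or CAS number contains every search term,
// in any order, shortest name first, capped at `limit` results.
function search(index, query, limit = 50) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return [];
  return index
    .filter((entry) => {
      const haystack = `${entry.name} ${entry.cas}`.toLowerCase();
      return terms.every((term) => haystack.includes(term));
    })
    .sort((a, b) => a.name.length - b.name.length)
    .slice(0, limit);
}

// Terms can be typed in any order and can be partial words.
console.log(search(sampleIndex, "alcohol phen").map((e) => e.name));
// Shorter names sort ahead of longer ones.
console.log(search(sampleIndex, "acetate").map((e) => e.name));
```

Sorting by name length is a cheap stand-in for relevance: a short name that contains all the search terms is more likely to be the plain material than a long derivative or trade name.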

Benefits
  • FAST results
  • No installation
  • No hosting required
  • Cross-platform
  • One-time download
  • Uses the official, live TGSC website
  • No need to jump between Google and TGSC
Requirements
  • Modern web browser
  • Internet connection required to download the jQuery and Underscore libraries
  • Internet connection required to navigate TGSC pages
Limitations
  • Known synonym inaccuracies on TGSC are reflected here (e.g. amberXtreme).
  • Only shows top 50 results, so be more specific if your desired material doesn't show up immediately.
  • Not optimised for use on mobiles or other narrow screens. Designed for desktop use.
  • Limited to fragrance materials - does not include solvents or cosmetics materials at this stage.
  • Uses a June 2022 dataset, though I'm not sure there are many updates to TGSC these days anyway.
Good to know
  • The index takes in all Names, Synonyms and Product names from the "Suppliers" section, so there's a good chance you'll find what you're looking for.
  • The script is offered as-is, and I believe it should be good enough for now. However, do reach out on this thread if you find bugs or have usability improvements or enhancement ideas etc.
  • Zipped size: 770KB, Unzipped size: 3.6MB
  • I'm just a hobbyist perfumery enthusiast and not associated with TGSC in any way.
  • I don't mind if anyone takes this, copies or hosts this somewhere or modifies it and makes it better.


Happy to answer any questions you may have.
Take it, use it and enjoy!
 

Attachments

  • AllFragrance-embedded-v1.zip
    751.9 KB · Views: 54

PeeWee678

Well-known member
Jan 7, 2022
445
273
Thanks for this. Not sure yet if I will use it often (I always Google for tgsc [materialname]) but it may come in handy.
I put it in a folder and dragged it to the bookmark bar in Chrome.
 

jfrater

Basenotes Plus
Jun 2, 2005
3,073
1,950
This is also probably a good time to ask if anyone has a complete downloadable scrape of the site - I believe such a thing exists somewhere and there really needs to be an archive kept just in case. Emails go unanswered and the site is now really just being maintained by the widow of the creator I believe.
 

ourmess

Well-known member
Apr 25, 2018
1,061
670
I did a full site scrape about a year ago, but it was purely for my own use. I hadn't intended to share it and I'm unsure how the current site's management would feel about the idea.

If the site goes into the void, then it may be a different story.
 

Alex F.

Well-known member
Nov 29, 2019
1,024
1,677
ourmess said: "I did a full site scrape about a year ago, but it was purely for my own use. I hadn't intended to share it and I'm unsure how the current site's management would feel about the idea. If the site goes into the void, then it may be a different story."

What size can one expect, are we talking MBs, GBs?
 

julian35

Basenotes Plus
Feb 28, 2009
1,304
170
Last full-scrape I did was July 2022 (1.84 GB on disk) for 68,792 items

EDIT:
June 13, 2023 scrape: 6 days on high-speed fiber!
7 layers (including supporting files) = 3.51 GB on disk for 44,435 items (159,541 failed with error code 404, page not found, or error code -1003, linked hostname not found).
 

julian35

Basenotes Plus
Feb 28, 2009
1,304
170
jameshillier said: "TGSC is an invaluable resource, but has always been slow to search and is hard to navigate. Recently with the search function going missing, I thought I'd share a little helper script I put together a while ago and have been polishing up today."

This is brilliant James. Thank you!
 

shackener

New member
Feb 11, 2023
14
18
This is excellent work, thank you! A great enhancement, IMO, would be if the results in the search dropdown were MMB-clickable (middle mouse button, the default for "open in new tab"). It's already much faster to work with this than with regular TGSC, but being able to rapid-fire different TGSC tabs would be god-like. Let me know! Subscribed 😁
 

Zongo

Member
Mar 11, 2023
63
8
@jameshillier Good idea. Could you make the scrollbar follow the selection when moving down the list of suggestions with the arrow keys? (or is it my browser, Brave, the culprit?)
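
On the scrolling question, one possible approach (an assumption, not taken from James's file) is to work out where the list needs to scroll so that the highlighted result stays in view. The helper name `scrollTopToReveal` and its parameters are hypothetical. In a browser, calling `element.scrollIntoView({ block: "nearest" })` on the highlighted item does much the same thing.

```javascript
// Hypothetical helper: returns the scrollTop the results list should use so
// the highlighted item is fully visible. All arguments are pixel values
// measured from the top of the list's scrollable content.
function scrollTopToReveal(itemTop, itemHeight, viewTop, viewHeight) {
  const itemBottom = itemTop + itemHeight;
  if (itemTop < viewTop) {
    return itemTop; // item is above the visible window: scroll up to it
  }
  if (itemBottom > viewTop + viewHeight) {
    return itemBottom - viewHeight; // item is below: scroll down just enough
  }
  return viewTop; // already visible: leave the scroll position alone
}

// A 20px item at 200px in a 100px-tall list scrolled to the top.
console.log(scrollTopToReveal(200, 20, 0, 100));
```

The result would be assigned to the list container's `scrollTop` each time the arrow keys move the highlight.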
 

PeeWee678

Well-known member
Jan 7, 2022
445
273
Zongo said: "or is it my browser, Brave, the culprit?"

No, it's the same in Chrome.

BTW: I just scraped TGSC with HTTrack. I only got about 36k files (around 1.64 GB) though. Not sure yet what's missing.
With two simple edits in James' file I got the whole thing working locally so there is hope in case TGSC goes offline.

Not sure why I only got 36k files and not 69k like Julian but I did a test with no internet connection and thus far it seems to be working fine.
James' solution is especially useful for local use since navigating via the alphabetic links is a bit awkward (and no Google of course).

Thanks again James!
 

Culpa Ire

Active member
Nov 11, 2022
202
228
Like a few others, I did a scrape a few days ago when the search function failed to work. The folder is 1.94 GB and there are 69,810 items. 216 errors were logged, which is a lot more than the last time I did this.

Looking forward to using James' search script; looks really useful. Thank you very much for your efforts, James.
 

mnitabach

Basenotes Plus
Nov 13, 2020
4,470
2,187
Many thanks to those of you who are trying to scrape & archive TGSC for insurance purposes. To lose it would be an enormous tragedy & practical impediment to perfumery.
 

PeeWee678

Well-known member
Jan 7, 2022
445
273
Culpa Ire said: "Like a few others, I did a scrape a few days ago when the search function failed to work. The folder is 1.94 GB and there are 69,810 items. 216 errors were logged, which is a lot more than the last time I did this. Looking forward to using James' search script; looks really useful. Thank you very much for your efforts, James."

I only had to change the URL in the following lines in James' script to make it work locally
(the changes below depend on your local installation, of course):

Code:
I changed line 156
<iframe width="1440px" height="1440px" id="iframe" src="http://www.thegoodscentscompany.com/fragonly-a.html"></iframe>
to:
<iframe width="1440px" height="1440px" id="iframe" src="file:///F:/TGSC/TGSC/www.thegoodscentscompany.com/fragonly-a.html"></iframe>

and changed line 7841:
$('#iframe').attr('src', `http://thegoodscentscompany.com/data/${ref}.html`)
to:
$('#iframe').attr('src', `file:///F:/TGSC/TGSC/www.thegoodscentscompany.com/data/${ref}.html`)

Just replace the original URLs with ones that point to your scraped website.
Hope this helps.
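
If you expect to switch between the live site and a local copy, one option is to keep the base address in a single place. This is only a sketch: `dataPageUrl`, `LIVE_BASE` and `LOCAL_BASE` are names made up here and do not appear in James' file, and the local path is just the example from my installation.

```javascript
// Assumed constants: adjust LOCAL_BASE to wherever your scrape lives.
const LIVE_BASE = "http://thegoodscentscompany.com";
const LOCAL_BASE = "file:///F:/TGSC/TGSC/www.thegoodscentscompany.com";

// Builds the address of a material's data page from a base address and the
// page reference, tolerating a trailing slash on the base.
function dataPageUrl(base, ref) {
  const trimmedBase = base.replace(/\/+$/, "");
  return `${trimmedBase}/data/${ref}.html`;
}

// In the script, the existing navigation line could then become something like:
//   $('#iframe').attr('src', dataPageUrl(LOCAL_BASE, ref))
console.log(dataPageUrl(LOCAL_BASE, "example-ref"));
```

Switching back to the live site is then a matter of passing `LIVE_BASE` instead of `LOCAL_BASE`.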
 

Culpa Ire

Active member
Nov 11, 2022
202
228
Thanks for this info, Peewee. However, I'm not really au fait with this sort of thing so where would I find that code? I know the path details for where the file is stored but I'd be guessing for the above query. Assuming it's something to do with the inspector or something?

Figured it would be better to do it here rather than a DM in case anyone else wants to know, too.
 

Zongo

Member
Mar 11, 2023
63
8
@Culpa Ire You swap "http://" for "file:///" in two occurrences in the AllFragrance-embedded-v1.html file. The path given with "file:///..." must point to your local data, so take care to adapt it to the directory HTTrack downloaded the website to.

P.S. My answer was maybe a bit too simplified, as you actually have to take care that 'www.' is in there:
Code:
$('#iframe').attr('src', `file:///F:/TGSC/TGSC/www.thegoodscentscompany.com/data/${ref}.html`)
... which wasn't in there before.
 
