Magic URL Data Extractor: A Cross-Domain Access Plugin

Magic URL Extractor is a cross-domain data extractor plugin. We are busy developing an optimized plugin for various purposes. Until then, use this plugin to extract cross-domain e-commerce product details from a URL (the plugin is currently in beta, so it only supports the top e-commerce sites in India).

The concept behind this script is similar to Seenit (an Indian fashion social discussion app). It is also a much-awaited cross-domain access plugin for accessing data without any API or XML permission.

How Magic URL Data Extractor Works

As we know, browsers block cross-domain requests for external data unless the remote server cooperates, for example through JSONP, so our concept is built on that mechanism. When you click the search button, the script creates a server request to load the URL data in encrypted mode and, while processing, converts the HTML data into XML/JSON format.

It is not easy to handle a complete external DOM without affecting your server load time, so we create a virtual DOM and access the data through a variable.


Our next step is to filter out the Ajax, CSS, and JavaScript requests. After filtering all of those, we convert the result into HTML format and store it in a local container. We then process the container again to find the exact data, and once we have all the required results we display only the actual product data.
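
As an illustration of that last lookup step, the sketch below pulls a few product fields out of an HTML string held in a variable. It assumes the page exposes Open Graph meta tags (og:title, og:image), which many but not all product pages do. The function names readMeta and extractProduct and the sample markup are hypothetical; the plugin's real selectors are not shown in this post.

```javascript
// Hypothetical helper: returns the content of <meta property="..."> or null.
// Assumes the property attribute comes before content, as in the sample.
function readMeta(html, property) {
  var pattern = new RegExp(
    '<meta[^>]*property=["\']' + property + '["\'][^>]*content=["\']([^"\']*)["\']',
    'i'
  );
  var match = pattern.exec(html);
  return match ? match[1] : null;
}

// Collects the fields we care about into one plain object.
function extractProduct(html) {
  return {
    title: readMeta(html, 'og:title'),
    image: readMeta(html, 'og:image')
  };
}

// Made-up container content for the example.
var container = '<head>' +
  '<meta property="og:title" content="Blue Shirt">' +
  '<meta property="og:image" content="http://example.com/shirt.jpg">' +
  '</head>';

console.log(extractProduct(container));
```

A field that is missing from the page comes back as null, so the display step can decide whether to skip it.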

URL --> Proxy mapping --> Cross-domain request --> Request and save HTML data as JSON --> Filter data to avoid console errors and optimize the DOM --> Process various data handling --> Show the result
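
The flow above can be read as a chain of stages in which each stage receives the output of the previous one. The sketch below only shows that chaining. The stage functions are stand-ins written for the example (the fetch stage returns canned markup instead of making a network request), and none of the names come from the plugin itself.

```javascript
// Runs the stages in order, feeding each result into the next stage.
// Stages may be synchronous or return a Promise.
function runPipeline(stages, input) {
  return stages.reduce(function (previous, stage) {
    return previous.then(stage);
  }, Promise.resolve(input));
}

// Stand-in stages (assumptions for the example, not the plugin's code).
function toProxyUrl(url) {
  return 'http://proxy.example/?url=' + encodeURIComponent(url);
}
function fetchHtml(proxyUrl) {
  // A real stage would request proxyUrl; here we return canned markup.
  return Promise.resolve('<script>x()</script><h1>Blue Shirt</h1>');
}
function stripScripts(html) {
  return html.replace(/<script[^>]*>[\S\s]*?<\/script>/gi, '');
}
function pickTitle(html) {
  var match = /<h1[^>]*>([^<]*)<\/h1>/i.exec(html);
  return { title: match ? match[1] : null };
}

runPipeline([toProxyUrl, fetchHtml, stripScripts, pickTitle],
  'http://example.com/product')
  .then(function (result) { console.log(result.title); });
```

Keeping each stage as its own function makes it possible to swap the fetch stage or add another filter without touching the rest of the chain.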

Cross-Domain Access Plugin: Handling the DOM and Retrieving Data

The trick is to fetch the cross-domain HTML data and convert it into JSON format using a YQL query, as below:

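 $.getJSON("http://query.yahooapis.com/v1/public/yql?"+
 "q=select%20*%20from%20html%20where%20url%3D%22"+
 encodeURIComponent(url)+
 "%22&format=xml&callback=?", // callback=? makes jQuery use JSONP
 function (response) {
   // the fetched markup is expected as a string in response.results[0]
   var data = response.results[0];
 });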
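
To make the request easier to reuse, the URL construction can be pulled into a small helper. The sketch below is only an illustration: buildYqlUrl is a hypothetical name that is not part of the plugin, and it assumes the same YQL endpoint and query as the snippet above.

```javascript
// Hypothetical helper (not part of the plugin): builds the YQL request URL
// for a given page address, using the endpoint from the snippet above.
function buildYqlUrl(pageUrl) {
  // The YQL statement is written in plain text, then encoded in one step.
  var yql = 'select * from html where url="' + pageUrl + '"';
  return 'http://query.yahooapis.com/v1/public/yql?' +
    'q=' + encodeURIComponent(yql) +
    '&format=xml&callback=?';
}

console.log(buildYqlUrl('http://example.com/product?id=1'));
```

Encoding the whole statement at once avoids hand-writing escapes such as %20 and %22, and the result can be passed straight to $.getJSON.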

After retrieving the data, we filter out the unnecessary JS and CSS components.

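 // remove opening and closing body tags
 data = data.replace(/<\/?body[^>]*>/gi,'');
 // remove line breaks
 data = data.replace(/[\r\n]+/g,'');
 // remove HTML comments
 data = data.replace(/<!--[\S\s]*?-->/g,'');
 // remove noscript blocks
 data = data.replace(/<noscript[^>]*>[\S\s]*?<\/noscript>/gi,'');
 // remove script blocks
 data = data.replace(/<script[^>]*>[\S\s]*?<\/script>/gi,'');
 // remove self-closing script tags
 data = data.replace(/<script[^>]*\/>/gi,'');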
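
To see the effect of this kind of filtering, here is a small self-contained sketch that wraps the replacements in one function and runs them on a sample string. The function name filterData and the sample markup are assumptions made for illustration. The patterns are written so that comments are matched by `<!--` and both opening and closing body tags are removed.

```javascript
// Illustrative sketch: filterData is a hypothetical name, and the sample
// markup below is made up for the example.
function filterData(html) {
  return html
    .replace(/<\/?body[^>]*>/gi, '')                       // body tags
    .replace(/[\r\n]+/g, '')                               // line breaks
    .replace(/<!--[\S\s]*?-->/g, '')                       // comments
    .replace(/<noscript[^>]*>[\S\s]*?<\/noscript>/gi, '')  // noscript blocks
    .replace(/<script[^>]*>[\S\s]*?<\/script>/gi, '')      // script blocks
    .replace(/<script[^>]*\/>/gi, '');                     // self-closing scripts
}

var sample = [
  '<body class="page">',
  '<!-- tracking -->',
  '<script>track();</script>',
  '<h1>Blue Shirt</h1>',
  '<noscript>Enable JS</noscript>',
  '</body>'
].join('\n');

console.log(filterData(sample)); // <h1>Blue Shirt</h1>
```

Regular expressions are a pragmatic shortcut here, not a full HTML parser, so unusual markup can still slip through.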

We know this processing can feel laggy and makes your DOM heavy, but only for a short while: the script releases the virtual DOM data as soon as you get the final response.


I hope you will love this script. Let me know your views.


Himanshu is a young engineer living in India, currently working at LiveCareer as a software engineer. He is also an ethical hacker and blogger who does lots of crazy stuff. If that sounds interesting, go through his portfolio: www.himstar.info. "Open Source. Millions of open minds can't be wrong!"
