SEO Tip: How to Configure robots.txt


트렌드랩 2024. 2. 14. 12:53


SEO Tip: how to configure robots.txt, restricting and allowing crawling, and per-crawler settings
(including how to declare the sitemap URL)

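Search engine crawlers consult the settings in a site's robots.txt file when they collect pages from the website. The convention is called the Robots Exclusion Protocol: it was proposed by Martijn Koster in 1994 and published as RFC 9309 in 2022. Compliance is voluntary, so robots.txt guides well-behaved crawlers rather than enforcing access control.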

So today I'll walk through what you can write in a robots.txt file and how to manage it.

Let's start by looking at the rules below.

User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml

1. Googlebot, Google's search crawler, must not crawl pages under the /nogooglebot/ path of the site.

2. All other crawlers may collect pages under any path.

3. The sitemap is located at https://www.example.com/sitemap.xml.

That is how these rules read.
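
To double-check that reading, here is a minimal sketch using Python's standard-library urllib.robotparser. The rules are inlined as a string so nothing is fetched over the network, and the page names (page.html, index.html) are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The example rules from above, inlined so no network request is needed.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# (crawler name, URL) pairs to check; the page names are hypothetical.
checks = [
    ("Googlebot", "https://www.example.com/nogooglebot/page.html"),
    ("Googlebot", "https://www.example.com/index.html"),
    ("Bingbot", "https://www.example.com/nogooglebot/page.html"),
]
results = {(agent, url): parser.can_fetch(agent, url) for agent, url in checks}

for (agent, url), allowed in results.items():
    print(f"{agent:10} {'allowed' if allowed else 'blocked'}  {url}")

# site_maps() is available from Python 3.8 onward.
sitemaps = parser.site_maps()
print("Sitemaps:", sitemaps)
```

Only the Googlebot request under /nogooglebot/ should come back as blocked. This reflects how Python's parser interprets the file; a real crawler may differ in edge cases.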

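User-agent specifies which search engine crawler a group of rules applies to. To target every crawler, use an asterisk (*).

Refer to the list below for the user agent of a specific search engine.

User-Agent list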

Search Engine	User Agent	Purpose of Crawling
Alexa	ia_archiver	Crawler for Ranking
AOL	aolbuild	Search
Ask Jeeves	teoma	Search
Baidu	Baiduspider	Search
Baidu	Baiduspider-favo	Baidu Favorites
Baidu	Baiduspider-cpro	Baidu Union
Baidu	Baiduspider-ads	Business Search (Advertisements)
Baidu	Baiduspider	Desktop
Baidu	Baiduspider-image	Image Search
Baidu	Baiduspider	Mobile
Baidu	Baiduspider-news	News Search
Baidu	Baiduspider-video	Video Search
Bing	AdIdxBot	Bing Ads
Bing	Bingbot	Desktop and Mobile
Bing	BingPreview	Page Snapshots
Bing	MSNBot	Predecessor of Bingbot
Bing	MSNBot-Media	Images and Videos
Daum	Daumoa	Search
DuckDuckGo	DuckDuckBot	Search
Google	AdsBot-Google	Landing Page Quality Check
Google	AdsBot-Google-Mobile-Apps	App Crawler
Google	Googlebot	Desktop
Google	Googlebot	Smartphone
Google	Googlebot-Image	Images
Google	Googlebot-News	News
Google	Googlebot-Video	Videos
Google	Mediapartners-Google	AdSense Desktop
Google	Mediapartners-Google	AdSense Mobile
MSN	msnbot	Search
Naver	Yeti	Search
Teoma	teoma	Search
Yahoo!	Slurp	All Search
Yandex	YaDirectFetcher	Advertising
Yandex	Yandex	All Crawling
Yandex	YandexAntivirus	Malware Checker
Yandex	YandexBlogs	Blog Posts and Comments
Yandex	YandexBot	Desktop
Yandex	YandexCalendar	Calendar
Yandex	YandexDirect	Advertising
Yandex	YandexDirectDyn	Dynamic Banners
Yandex	YandexFavicons	Favicons
Yandex	YandexImageResizer	Mobile Image Services
Yandex	YandexImages	Images
Yandex	YandexMedia	Media
Yandex	YandexMetrika	Web Analytics
Yandex	YandexMobileBot	Mobile
Yandex	YandexNews	News
Yandex	YandexPagechecker	Micro Markup Validator
Yandex	YandexScreenshotBot	Screenshot
Yandex	YandexSitelinks	Sitelinks
Yandex	YandexVertis	Vertical Search
Yandex	YandexWebmaster	Webmaster Services

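As for Disallow and Allow:

They declare whether crawling is permitted or forbidden for a specific sub-path.

In the example below, Naver's crawler (Yeti) is told not to crawl pages under any folder whose name starts with private, such as /private-video/ or /private-image/. (Rules like this are usually applied to back-office paths.)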

User-agent: Yeti
Disallow: /private*/
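
The * wildcard in a path is described in RFC 9309 and in Google's robots.txt documentation, but not every parser implements it. Python's urllib.robotparser, for instance, only does plain prefix matching as far as I know, so check each crawler's own documentation. As a rough sketch of how such a pattern is matched, the hypothetical helpers below (robots_pattern_to_regex and path_matches are my own names, not part of any library) translate a pattern into a regular expression.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters, a trailing '$' anchors the end of
    the URL path, and everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' per '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def path_matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start of the path, giving prefix semantics.
    return robots_pattern_to_regex(pattern).match(path) is not None


# Sample paths are made up for illustration.
sample_paths = [
    "/private-video/clip.html",
    "/private-image/",
    "/private-image",
    "/public/private-video/",
]
for path in sample_paths:
    print(f"{path:28} matches /private*/: {path_matches('/private*/', path)}")
```

Under this reading, /private-image without a trailing slash is not covered by /private*/, because the pattern requires a slash somewhere after the private prefix.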

https://trendylab.tistory.com/robots.txt

By default, Tistory generates a robots.txt like the one below.

User-agent: *
Disallow: /guestbook
Disallow: /m/guestbook
Disallow: /manage
Disallow: /owner
Disallow: /admin
Disallow: /search
Disallow: /m/search

User-agent: bingbot
Crawl-delay: 20
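
A small sketch of how these default rules evaluate, again with Python's urllib.robotparser and the file inlined as a string. The post path /1 is a made-up example. One thing worth noticing is that a crawler with its own group (bingbot here) follows only that group, so in this parser the Disallow lines under * do not apply to it; I would assume real Bingbot behaves similarly, but that is not verified here.

```python
from urllib.robotparser import RobotFileParser

# Tistory's default rules as listed above, inlined as a string.
TISTORY_ROBOTS_TXT = """\
User-agent: *
Disallow: /guestbook
Disallow: /m/guestbook
Disallow: /manage
Disallow: /owner
Disallow: /admin
Disallow: /search
Disallow: /m/search

User-agent: bingbot
Crawl-delay: 20
"""

parser = RobotFileParser()
parser.parse(TISTORY_ROBOTS_TXT.splitlines())

base = "https://trendylab.tistory.com"

# Yeti has no group of its own, so it falls back to the '*' group.
yeti_manage = parser.can_fetch("Yeti", base + "/manage")
yeti_post = parser.can_fetch("Yeti", base + "/1")  # "/1" is a hypothetical post path
yeti_delay = parser.crawl_delay("Yeti")

# bingbot has its own group, which contains only a Crawl-delay.
bing_delay = parser.crawl_delay("bingbot")
bing_manage = parser.can_fetch("bingbot", base + "/manage")

print("Yeti    /manage allowed:", yeti_manage)
print("Yeti    /1 allowed:     ", yeti_post)
print("Yeti    crawl delay:    ", yeti_delay)
print("bingbot crawl delay:    ", bing_delay)
print("bingbot /manage allowed:", bing_manage)
```

If the admin paths should be off limits to bingbot as well, the Disallow lines would need to be repeated inside the bingbot group.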

You can also check it easily through Naver Search Advisor.