linux-nerds.org

Your browser does not seem to support JavaScript. As a result, your viewing experience will be diminished, and you have been placed in read-only mode.

Please download a browser that supports JavaScript, or enable it if it's disabled (i.e. NoScript).

AI Bots aussperren

Linux

2 Beiträge 1 Kommentatoren 202 Aufrufe

F Offline
F Offline
FrankM

schrieb am zuletzt editiert von

#1
Ich glaube, wegen @thomas@metalhead.club bin ich über dieses Projekt gestolpert. Es gibt ja private Webprojekte im Netz, wo ich diesen AI-Bot-Besuch nicht so gerne möchte. Da ich gelesen habe, das sich diese Bots nicht an die robots.txt halten, muss man sich halt was einfallen um sie zu ärgern Ich denke, das es kein 100%iger Schutz ist, aber evtl. ist es ein Anfang.

GitHub - ai-robots-txt/ai.robots.txt: A list of AI agents and robots to block.

A list of AI agents and robots to block. Contribute to ai-robots-txt/ai.robots.txt development by creating an account on GitHub.

GitHub (github.com)

Da ich Nginx nutze, benötige ich dieses File.
```
nginx-block-ai-bots.conf
```
Das muss dann noch in Nginx eingebaut werden und dann sollte es funktionieren. Mein Ansatz um das zu lösen, sieht so aus. Ich habe ein Script

ai-block.sh
```
#!/bin/bash
# Script um AI-Bots zu blocken
# https://github.com/ai-robots-txt/ai.robots.txt/tree/main

mkdir /root/AI-test
cd /root/AI-test

## Daten holen
curl -O https://raw.githubusercontent.com/ai-robots-txt/ai.robots.txt/master/nginx-block-ai-bots.conf

## Daten in nginx einbauen
mv nginx-block-ai-bots.conf /etc/nginx/blocklists/

## NGINX neustarten
systemctl restart nginx.service
```
Das wird über einen crontab Eintrag jeden Tag aufgerufen.
```
0 1 * * * /root/ai-block.sh
```
Und in Nginx wird die Blocklist so geladen. Nur ein Ausschnitt aus dem Server Block.
```
## Server Block
server {
   listen       443 ssl;
   listen  [::]:443 ssl;
   server_name  <DOMAIN>;

   # AI-Bots blockieren
    include /etc/nginx/blocklists/nginx-block-ai-bots.conf;
```
Testen kann man das dann so.
```
  curl -A "ChatGPT-User" <DOMAIN>
  curl -A "Mozilla/5.0" <DOMAIN>
```
Der erste Test bringt dann einen 403
```
<html>
<head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx/1.22.1</center>
</body>
</html>
```
Der zweite Test ist ein erlaubter Zugriff.

Danke an @thomas@metalhead.club für den Beitrag zu o.g. Projekt.

Und zum Schluss noch die Frage "Was kann man noch machen?"

Als Ergänzung noch, ich arbeite gerne mit diesen AI's an meinen Python Projekten, aber trotzdem gibt es private Webseiten von mir, wo ich das nicht möchte.
Im Fediverse -> @FrankM@nrw.social

NanoPi R5S

Quartz64 Model B, 4GB RAM

Quartz64 Model A, 4GB RAM

RockPro64 v2.1
1 Antwort Letzte Antwort

0

FrankM

schrieb am

Wir können das noch für eine sanfte Methode erweitern, das ist die Datei robots.txt, wo man sich in alten Zeiten mal dran hielt. Einige Bots machen das, andere nicht. Praktisch, das o.g. Projekt bietet diese Datei auch an. Dann werden wir das kurz mal mit einbauen.

ai-block.sh

#!/bin/bash
# Script um AI-Bots zu blocken
# https://github.com/ai-robots-txt/ai.robots.txt/tree/main

mkdir /root/AI-test
cd /root/AI-test

## Daten holen
curl -O https://raw.githubusercontent.com/ai-robots-txt/ai.robots.txt/master/nginx-block-ai-bots.conf
curl -O https://raw.githubusercontent.com/ai-robots-txt/ai.robots.txt/master/robots.txt

## Daten in nginx einbauen
mv nginx-block-ai-bots.conf /etc/nginx/blocklists/
mv robots.txt /var/www/html

## NGINX neustarten
systemctl restart nginx.service

Damit das in nginx funktioniert. Den Server Block um folgendes erweitern.

# Serve robots.txt directly from Nginx
location = /robots.txt {
    root /var/www/html;
    try_files $uri =404;
}

Kurzer Test

https://<DOMAIN>/robots.txt

Ergebnis

User-agent: AI2Bot
User-agent: Ai2Bot-Dolma
User-agent: Amazonbot
User-agent: anthropic-ai
User-agent: Applebot
User-agent: Applebot-Extended
User-agent: Brightbot 1.0
User-agent: Bytespider
User-agent: CCBot
User-agent: ChatGPT-User
User-agent: Claude-Web
User-agent: ClaudeBot
User-agent: cohere-ai
User-agent: cohere-training-data-crawler
User-agent: Crawlspace
User-agent: Diffbot
User-agent: DuckAssistBot
User-agent: FacebookBot
User-agent: FriendlyCrawler
User-agent: Google-Extended
User-agent: GoogleOther
User-agent: GoogleOther-Image
User-agent: GoogleOther-Video
User-agent: GPTBot
User-agent: iaskspider/2.0
User-agent: ICC-Crawler
User-agent: ImagesiftBot
User-agent: img2dataset
User-agent: imgproxy
User-agent: ISSCyberRiskCrawler
User-agent: Kangaroo Bot
User-agent: Meta-ExternalAgent
User-agent: Meta-ExternalFetcher
User-agent: OAI-SearchBot
User-agent: omgili
User-agent: omgilibot
User-agent: PanguBot
User-agent: Perplexity-User
User-agent: PerplexityBot
User-agent: PetalBot
User-agent: Scrapy
User-agent: SemrushBot-OCOB
User-agent: SemrushBot-SWA
User-agent: Sidetrade indexer bot
User-agent: Timpibot
User-agent: VelenPublicWebCrawler
User-agent: Webzio-Extended
User-agent: YouBot
Disallow: /

Anmelden zum Antworten

D

Es ist, wie es immer ist: Ich bin eine 3/4 Stunde zu früh fertig mit dem Aufbau.
Beobachtet Ignoriert Geplant Angeheftet Gesperrt Verschoben Uncategorized infoveranstaltu windows10 supportende linux debian
2

1

0 Stimmen

2 Beiträge

1 Aufrufe

F

@dufthummel Ich drücke Euch die Daumen, das ihr viel Publikum habt.Finde solche Veranstaltungen sehr wichtig, irgendwas in @Duesseldorf wo man mitmachen kann?
N

Did you know?
Beobachtet Ignoriert Geplant Angeheftet Gesperrt Verschoben Uncategorized free linux
1

0 Stimmen

1 Beiträge

17 Aufrufe

Niemand hat geantwortet
F

Update 1.32.5 - Security Fixes!
Beobachtet Ignoriert Geplant Angeheftet Gesperrt Verschoben Vaultwarden vaultwarden linux
1

0 Stimmen

1 Beiträge

139 Aufrufe

Niemand hat geantwortet
F

Manjaro - KDE Plasma 6
Beobachtet Ignoriert Geplant Angeheftet Gesperrt Verschoben Linux manjaro linux plasma6 kde
3

3

0 Stimmen

3 Beiträge

902 Aufrufe

F

Da fällt mir heute beim Lesen dieses Beitrages auf das ich damals ja auf unstable gestellt habe. [frank-manjaro ~]# pacman-mirrors --get-branch unstable Anleitung dazu -> https://wiki.manjaro.org/index.php/Switching_Branches Ok, da könnte ja auch mal was schief gehen? Da ich hier aber ein btrfs Filesystem fahre und Timeshift Snapshots anlegt, sollte das Risiko überschaubar sein. [image: 1714893983029-567442e5-80f0-4ce9-9b91-3e8f9a4a94d8-grafik.png] Es werden bei jeder Aktion vorher Snapshots angelegt, auf die man im Grub Menü zugreifen kann und diese wieder installieren lassen kann. Hatte das früher schon mal getestet, ging wirklich gut. Werde ich die Tage auch hier auf dem System, zur Sicherheit, mal testen. Fazit, ich lasse das mal so wie es ist
F

ZFS - Wichtige Befehle
Beobachtet Ignoriert Geplant Angeheftet Gesperrt Verschoben Linux zfs linux
3

0 Stimmen

3 Beiträge

1k Aufrufe

F

Heute mal drüber gestolpert, das es auch so was geben kann. root@pve2:~# zpool status pool: pool_NAS state: ONLINE status: Some supported and requested features are not enabled on the pool. The pool can still be used, but some features are unavailable. action: Enable all features using 'zpool upgrade'. Once this is done, the pool may no longer be accessible by software that does not support the features. See zpool-features(7) for details. scan: scrub repaired 0B in 00:20:50 with 0 errors on Sun Apr 13 00:44:51 2025 config: NAME STATE READ WRITE CKSUM pool_NAS ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ata-WDC_WDS100T1R0A-68A4W0_230520800733 ONLINE 0 0 0 ata-WDC_WDS100T1R0A-68A4W0_230520801376 ONLINE 0 0 0 errors: No known data errors Was machen? Als erstes mal ein Backup angestoßen. Danach root@pve2:~# zpool get all pool_NAS | grep feature pool_NAS feature@async_destroy enabled local pool_NAS feature@empty_bpobj active local pool_NAS feature@lz4_compress active local pool_NAS feature@multi_vdev_crash_dump enabled local pool_NAS feature@spacemap_histogram active local pool_NAS feature@enabled_txg active local pool_NAS feature@hole_birth active local pool_NAS feature@extensible_dataset active local pool_NAS feature@embedded_data active local pool_NAS feature@bookmarks enabled local pool_NAS feature@filesystem_limits enabled local pool_NAS feature@large_blocks enabled local pool_NAS feature@large_dnode enabled local pool_NAS feature@sha512 enabled local pool_NAS feature@skein enabled local pool_NAS feature@edonr enabled local pool_NAS feature@userobj_accounting active local pool_NAS feature@encryption enabled local pool_NAS feature@project_quota active local pool_NAS feature@device_removal enabled local pool_NAS feature@obsolete_counts enabled local pool_NAS feature@zpool_checkpoint enabled local pool_NAS feature@spacemap_v2 active local pool_NAS feature@allocation_classes enabled local pool_NAS feature@resilver_defer enabled local pool_NAS feature@bookmark_v2 enabled local pool_NAS feature@redaction_bookmarks enabled local pool_NAS feature@redacted_datasets enabled local pool_NAS feature@bookmark_written enabled local pool_NAS feature@log_spacemap active local pool_NAS feature@livelist enabled local pool_NAS feature@device_rebuild enabled local pool_NAS feature@zstd_compress enabled local pool_NAS feature@draid enabled local pool_NAS feature@zilsaxattr disabled local pool_NAS feature@head_errlog disabled local pool_NAS feature@blake3 disabled local pool_NAS feature@block_cloning disabled local pool_NAS feature@vdev_zaps_v2 disabled local Das kommt von neuen Funktionen, die zu ZFS hinzugefügt wurden und bei Erstellung des Pools nicht vorhanden waren. Dann upgraden wir mal root@pve2:~# zpool upgrade pool_NAS This system supports ZFS pool feature flags. Enabled the following features on 'pool_NAS': zilsaxattr head_errlog blake3 block_cloning vdev_zaps_v2 Kontrolle root@pve2:~# zpool status pool: pool_NAS state: ONLINE scan: scrub repaired 0B in 00:20:50 with 0 errors on Sun Apr 13 00:44:51 2025 config: NAME STATE READ WRITE CKSUM pool_NAS ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 ata-WDC_WDS100T1R0A-68A4W0_230520800733 ONLINE 0 0 0 ata-WDC_WDS100T1R0A-68A4W0_230520801376 ONLINE 0 0 0 errors: No known data errors Features kontrollieren root@pve2:~# zpool get all pool_NAS | grep feature pool_NAS feature@async_destroy enabled local pool_NAS feature@empty_bpobj active local pool_NAS feature@lz4_compress active local pool_NAS feature@multi_vdev_crash_dump enabled local pool_NAS feature@spacemap_histogram active local pool_NAS feature@enabled_txg active local pool_NAS feature@hole_birth active local pool_NAS feature@extensible_dataset active local pool_NAS feature@embedded_data active local pool_NAS feature@bookmarks enabled local pool_NAS feature@filesystem_limits enabled local pool_NAS feature@large_blocks enabled local pool_NAS feature@large_dnode enabled local pool_NAS feature@sha512 enabled local pool_NAS feature@skein enabled local pool_NAS feature@edonr enabled local pool_NAS feature@userobj_accounting active local pool_NAS feature@encryption enabled local pool_NAS feature@project_quota active local pool_NAS feature@device_removal enabled local pool_NAS feature@obsolete_counts enabled local pool_NAS feature@zpool_checkpoint enabled local pool_NAS feature@spacemap_v2 active local pool_NAS feature@allocation_classes enabled local pool_NAS feature@resilver_defer enabled local pool_NAS feature@bookmark_v2 enabled local pool_NAS feature@redaction_bookmarks enabled local pool_NAS feature@redacted_datasets enabled local pool_NAS feature@bookmark_written enabled local pool_NAS feature@log_spacemap active local pool_NAS feature@livelist enabled local pool_NAS feature@device_rebuild enabled local pool_NAS feature@zstd_compress enabled local pool_NAS feature@draid enabled local pool_NAS feature@zilsaxattr enabled local pool_NAS feature@head_errlog active local pool_NAS feature@blake3 enabled local pool_NAS feature@block_cloning enabled local pool_NAS feature@vdev_zaps_v2 enabled local So, alle neuen Features aktiviert. Jetzt kann der Pool weiterhin seine Arbeit machen.
F

Wireguard - Client installieren
Beobachtet Ignoriert Geplant Angeheftet Gesperrt Verschoben Wireguard linux wireguard
3

0 Stimmen

3 Beiträge

831 Aufrufe

F

Ich kann dir nicht ganz folgen. Mein Wireguard Server ist eine VM im Netz. Mein Smartphone baut zu diesem eine Verbindung auf und ich habe mal eben nachgeschaut, was da so geht. Mein Smartphone ist aktuell im meinem WLan angemeldet. [image: 1586458461693-6e0016dc-7e11-41e1-bba2-e52a3f1348df-image-resized.png] iperf3 -s -B 10.10.1.1 ----------------------------------------------------------- Server listening on 5201 ----------------------------------------------------------- Accepted connection from 10.10.1.10, port 44246 [ 5] local 10.10.1.1 port 5201 connected to 10.10.1.10 port 44248 [ ID] Interval Transfer Bitrate [ 5] 0.00-1.00 sec 4.98 MBytes 41.7 Mbits/sec [ 5] 1.00-2.00 sec 5.52 MBytes 46.3 Mbits/sec [ 5] 2.00-3.00 sec 4.80 MBytes 40.3 Mbits/sec [ 5] 3.00-4.00 sec 4.17 MBytes 35.0 Mbits/sec [ 5] 4.00-5.00 sec 5.04 MBytes 42.3 Mbits/sec [ 5] 5.00-6.00 sec 5.43 MBytes 45.6 Mbits/sec [ 5] 6.00-7.00 sec 5.75 MBytes 48.3 Mbits/sec [ 5] 7.00-8.00 sec 5.70 MBytes 47.8 Mbits/sec [ 5] 8.00-9.00 sec 5.73 MBytes 48.1 Mbits/sec [ 5] 9.00-10.00 sec 5.65 MBytes 47.4 Mbits/sec [ 5] 10.00-10.04 sec 206 KBytes 46.5 Mbits/sec - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate [ 5] 0.00-10.04 sec 53.0 MBytes 44.3 Mbits/sec receiver ----------------------------------------------------------- Server listening on 5201 ----------------------------------------------------------- Accepted connection from 10.10.1.10, port 44250 [ 5] local 10.10.1.1 port 5201 connected to 10.10.1.10 port 44252 [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-1.00 sec 4.80 MBytes 40.2 Mbits/sec 0 253 KBytes [ 5] 1.00-2.00 sec 14.7 MBytes 123 Mbits/sec 181 379 KBytes [ 5] 2.00-3.00 sec 9.68 MBytes 81.2 Mbits/sec 58 294 KBytes [ 5] 3.00-4.00 sec 8.88 MBytes 74.5 Mbits/sec 1 227 KBytes [ 5] 4.00-5.00 sec 7.76 MBytes 65.1 Mbits/sec 0 245 KBytes [ 5] 5.00-6.00 sec 8.88 MBytes 74.5 Mbits/sec 0 266 KBytes [ 5] 6.00-7.00 sec 9.81 MBytes 82.3 Mbits/sec 0 289 KBytes [ 5] 7.00-8.00 sec 7.82 MBytes 65.6 Mbits/sec 35 235 KBytes [ 5] 8.00-9.00 sec 5.59 MBytes 46.9 Mbits/sec 4 186 KBytes [ 5] 9.00-10.00 sec 6.64 MBytes 55.7 Mbits/sec 0 207 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.04 sec 84.6 MBytes 70.6 Mbits/sec 279 sender ----------------------------------------------------------- Server listening on 5201 ----------------------------------------------------------- ^Ciperf3: interrupt - the server has terminated Im zweiten Teil ist der Wireguard Server der Sender. Bis jetzt hatte ich eigentlich nie Probleme, auch nicht unterwegs. Aber, ich gehe davon aus, das ich dich nicht 100% verstanden habe
F

Linux Befehle - ls & tail
Beobachtet Ignoriert Geplant Angeheftet Gesperrt Verschoben Linux linux
1

0 Stimmen

1 Beiträge

488 Aufrufe

Niemand hat geantwortet
F

Datensicherung zwischen zwei Server
Beobachtet Ignoriert Geplant Angeheftet Gesperrt Verschoben Linux linux
2

1

0 Stimmen

2 Beiträge

768 Aufrufe

F

Funktionskontrolle heute morgen war o.k. Schreibt die Daten aber noch ins falsche Verzeichnis, da muss ich nochmal ran.