Benutzer Diskussion:Stefan Kühn/Check Wikipedia/Archiv/2009/September


Ideas on the runtime problem

I read about the runtime problem in the section above and have given it some thought. I hope this helps you shorten the runtime while producing the same results. I also hope you do not feel attacked by this and that raising it in public like this is acceptable. I would like to help, because I find some of the error checks useful and fixing those errors improves article quality. Unfortunately I cannot manage to always have the current dump myself, and the number of suggested fixes is too much for one person. Good luck. Der Umherirrende 16:56, 1. Sep. 2009 (UTC)

What I forgot to say: hats off to what you have achieved so far. If you want to implement a suggestion, it is best to do it separately from other changes and compare the results (the output file, for example). Only then can you be sure that everything is still correct (and notice any difference in runtime, which can also get worse). If you think the suggestions will not help, that is fine; you are the one who has to implement them, and I would not hold it against you. Der Umherirrende 17:19, 1. Sep. 2009 (UTC)


Would it not also work to decide per project whether you need the large dump (All pages, current versions only.) or just the small one (Articles, templates, image descriptions, and primary meta-pages.), and then pick accordingly? For en that would halve the runtime (I assume they have no special namespace). --Der Umherirrende 16:56, 1. Sep. 2009 (UTC)


When you search for something with foreach, you should leave the loop early once it has been found. According to this page, that is done with last (I have no knowledge of Perl programming). Some ifs inside loops can then be slimmed down as well. --Der Umherirrende 16:56, 1. Sep. 2009 (UTC)
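
The early-exit idea can be sketched as follows. This is only an illustration in Python (the Check Wikipedia script itself is written in Perl, where the keyword is last); the function name find_first and the sample data are assumptions made for this example.

```python
def find_first(items, wanted):
    """Return (position, inspected): where `wanted` was found and how many items were examined."""
    position = -1
    inspected = 0
    for index, item in enumerate(items):
        inspected += 1
        if item == wanted:
            position = index
            break  # stop scanning; this is the Python counterpart of Perl's `last`
    return position, inspected
```

With the list ["Berlin", "Hamburg", "Dresden", "Leipzig"], searching for "Hamburg" examines only two items instead of four. The saving grows with the length of the list and with how early the match occurs.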


I would do the namespace checks at the beginning, right after the article has been read, and not inside the individual errors. If the article is not in a relevant namespace, there is no need to parse the wikitext at all, since everything would be discarded unused anyway. Another advantage is that you can control the namespaces more easily for individual projects. (During the initialization phase for the current project, define the relevant namespaces in an array that can then be checked against. For example, namespace 104 may suddenly not be of interest in other projects.) Der Umherirrende 16:56, 1. Sep. 2009 (UTC)
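
A minimal sketch of this suggestion, in Python rather than Perl. The per-project namespace sets, the fallback to namespace 0, and the names should_scan and scan_pages are all assumptions for illustration; the real script's data structures are not shown in this discussion.

```python
# Hypothetical per-project configuration, filled in once during initialization.
RELEVANT_NAMESPACES = {
    "dewiki": {0, 104},  # assumption: 104 matters for this project
    "enwiki": {0},       # assumption: only the article namespace matters here
}
DEFAULT_NAMESPACES = {0}  # assumption: fall back to the article namespace


def should_scan(project, namespace):
    """Decide from the namespace alone whether a page is worth parsing."""
    return namespace in RELEVANT_NAMESPACES.get(project, DEFAULT_NAMESPACES)


def scan_pages(project, pages, check):
    """Run `check` only on pages in a relevant namespace.

    `pages` is an iterable of (namespace, title, wikitext) tuples.
    """
    results = []
    for namespace, title, wikitext in pages:
        if not should_scan(project, namespace):
            continue  # skipped before any wikitext is parsed
        results.append((title, check(wikitext)))
    return results
```

The trade-off raised in the reply still applies: a single up-front filter is cheaper, while per-error exclusions are more flexible.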


Super. Many thanks for the tips. Since I consider myself an advanced beginner in Perl, I gladly take any tip. At the moment the focus is on the new interface, which is being well received. Plenty of errors are already listed there. But maybe during the long winter evenings I will get around to a real rewrite or a massive restructuring. A program like this usually grows organically, and then that can be quite time-consuming. I think I can get the biggest performance gain from some internal restructuring. I have already taken the dump point into account; I always use only the small ones. I already handle namespaces that way: the namespace is determined at the beginning, and each error excludes namespaces individually. I wanted to stay as flexible as possible. I already break out of loops where possible. - The overall problem is simply growth. You always have to keep in mind that something may still work today but no longer be feasible in three years. That is why I want to move away from the dump towards a kind of live scan that regularly works through, for example, the recent changes in the Wikipedias. In addition, I want to store a date for every article scan so that the same article is not scanned three times a day. But that is still a long way off. -- sk 18:56, 1. Sep. 2009 (UTC)
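
The idea of storing a scan date per article could look roughly like the sketch below. It is an assumption-laden illustration in Python: the one-day minimum interval, the in-memory dictionary standing in for the database, and the name pages_to_scan are not taken from the script.

```python
from datetime import datetime, timedelta


def pages_to_scan(changed_titles, last_scanned, now, min_interval=timedelta(days=1)):
    """Return the titles that are due for a scan and record `now` as their scan date.

    `last_scanned` maps a title to the datetime of its previous scan.
    """
    due = []
    for title in changed_titles:
        previous = last_scanned.get(title)
        if previous is None or now - previous >= min_interval:
            due.append(title)
            last_scanned[title] = now  # repeated edits on the same day are skipped
    return due
```

A title that appears several times in the recent changes is scanned only once, because its stored date is updated on the first occurrence.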

Error 082 in Finnish wikipedia

All the links starting with [[Wikipedia: (linking to Wikipedia namespace within fi-wiki) are included in the error report. --Jhattara 08:37, 1. Sep. 2009 (UTC)

IMHO: This is an error. We are writing an encyclopaedia, not a Wikipedia project, so every article should link only to other articles. Only then can the data be used outside of Wikipedia, for example in a book or in another project. -- sk 09:10, 1. Sep. 2009 (UTC)
Most of the links to the Wikipedia namespace in Finnish Wikipedia are on the pages for years, decades, and centuries, where there is a link to the discussion about how to write time in Finnish Wikipedia. Those clutter the list beyond any usability. If the link [[Wikipedia:Keskustelua ajan merkitsemisestä Wikipediassa|ajan merkitseminen]] is included in errors, this error report will remain useless for the Finnish Wikipedia. --Jhattara 07:41, 2. Sep. 2009 (UTC)
Actually... I just checked, and the link to the discussion is a redirect. The correct place it should link to in Finnish Wikipedia is [[Ohje:Merkitsemiskäytännöt]]. --Jhattara 07:43, 2. Sep. 2009 (UTC)
I understand the problem; we had the same in dewiki and in other languages. But this link belongs on the discussion page or in a comment inside the article, not in the article text. For example: if I read an article about the year 2001, I do not want to read about how to write this article. - In the near future I will implement a whitelist inside the new interface. I hope this will help with these problems. -- sk 08:33, 2. Sep. 2009 (UTC)

DEFAULTSORT (006 and 037)

Like ca.wiki, the Esperanto project has another name for "DEFAULTSORT". We use DEFAUxLTORDIGO, which produces a special letter ("DEFAŬLTORDIGO"). We also have to keep some special letters in the sortkey ("Sahxarov" in the sortkey = Saĥarov). These special letters are allowed in that project: ĉ, ĝ, ĥ, ĵ, ŝ, ŭ and also Ĉ, Ĝ, Ĥ, Ĵ, Ŝ, Ŭ (in uppercase). This is because they are different letters from c, g, h, j, s and u. Could they be ignored by errors 006 and 037? If you need the Unicode code points, just let me know. Thanks in advance. Castelobranco 00:52, 7. Sep. 2009 (UTC)

These letters are written with an "-x" ("cx", "gx", "hx", etc.), but the eo MediaWiki, and as far as I can see the Check Wikipedia dump too, recognizes them as diacritics (ĉ, ĝ, ĥ, etc.). Castelobranco 00:57, 7. Sep. 2009 (UTC)
Many thanks for this info. I will fix this bug. I have put it on my to-do list. -- sk 07:04, 7. Sep. 2009 (UTC)
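
The x-convention described above can be sketched as follows. This is a hedged Python illustration, not the script's code: the function names are invented for the example, and the actual checks behind errors 006 and 037 are not shown in this thread. The sketch only converts the "-x" spelling and lists non-ASCII letters that fall outside the allowed set.

```python
import re

# The six letter pairs named in the thread, in lowercase.
X_SYSTEM = {"cx": "ĉ", "gx": "ĝ", "hx": "ĥ", "jx": "ĵ", "sx": "ŝ", "ux": "ŭ"}
X_PAIR_RE = re.compile(r"[cghjsu]x", re.IGNORECASE)

# Letters the Esperanto project allows in sort keys, as listed in the thread.
ALLOWED_SPECIAL = set("ĉĝĥĵŝŭĈĜĤĴŜŬ")


def from_x_system(text):
    """Convert the "-x" spelling to the Esperanto letters, keeping the case."""
    def replace(match):
        pair = match.group(0)
        letter = X_SYSTEM[pair.lower()]
        return letter.upper() if pair[0].isupper() else letter
    return X_PAIR_RE.sub(replace, text)


def unexpected_letters(sortkey, allowed=ALLOWED_SPECIAL):
    """List the non-ASCII characters in a sort key that are not in `allowed`."""
    return [char for char in sortkey if ord(char) > 127 and char not in allowed]
```

Passing a different `allowed` set per project would let each wiki declare its own exceptions.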

Error 61 in ptwiki

In the list for error 61 - Reference with punctuation (4-sep-09), some articles are shown that do not have this error, like 105 Lélio Gama St. and 12758 (número). Rjclaudio 12:19, 4. Sep. 2009 (UTC)

I think this is from an old dump. If you want to be sure that the error is still in the article, use this new page. There a bot can find all articles from the database that no user has set to "Done". You can set the limit there to 500 and also page through the list with the parameter "offset". I hope this helps. -- sk 07:25, 7. Sep. 2009 (UTC)

Could you change the links at "List of all articles with error xxx" in the script to this new URL? Rjclaudio 01:38, 9. Sep. 2009 (CEST)

Suggestion for new errors with Defaultsort

Double Defaultsort, and Text after Defaultsort. Rjclaudio 01:47, 6. Sep. 2009 (CEST)

Double Defaultsort is a good idea. I will put it on the to-do list. But Text after Defaultsort is not possible. I have no good algorithm to detect this in de, en, es or ja, ar ... -- sk 07:16, 7. Sep. 2009 (UTC)

If you can do this for categories, why can't you use the same algorithm? Maybe you can create an error specific to some languages where this is easy to do. Rjclaudio 23:35, 8. Sep. 2009 (UTC)

What about checking for Defaultsort when the article is not categorized? In that situation it is useless (example). There are some... but perhaps you will tell me it should be categorized? -- - Archimëa ⇔ 10:48, 9. Sep. 2009 (CEST)
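
Two of the checks proposed in this section, a double Defaultsort and a Defaultsort on an uncategorized article, could be sketched like this. It is an assumption-heavy Python illustration: it only knows the English keywords DEFAULTSORT and Category, whereas a real check would need each project's localized names (DEFAŬLTORDIGO, Kategorie, and so on), and it does not skip comments or nowiki sections.

```python
import re

# Assumption: only the English magic word and namespace name are recognized.
DEFAULTSORT_RE = re.compile(r"\{\{\s*DEFAULTSORT\s*:", re.IGNORECASE)
CATEGORY_RE = re.compile(r"\[\[\s*Category\s*:", re.IGNORECASE)


def count_defaultsort(wikitext):
    """Count how many DEFAULTSORT declarations the wikitext contains."""
    return len(DEFAULTSORT_RE.findall(wikitext))


def has_double_defaultsort(wikitext):
    """True if DEFAULTSORT is declared more than once."""
    return count_defaultsort(wikitext) > 1


def defaultsort_without_category(wikitext):
    """True if a DEFAULTSORT is present but no category link is."""
    return count_defaultsort(wikitext) >= 1 and CATEGORY_RE.search(wikitext) is None
```

"Text after Defaultsort" is deliberately left out; as noted in the reply above, no good language-independent algorithm was known for it.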

Bot-readable updates?

I notice that pages like [1] and [2] haven't been getting updated recently. It would be really nice if those could be updated, in addition to having the new interface, because it's much easier to use a bot when there is a plain-text list of articles to copy. Thanks! -Drilnoth (Talk) 02:54, 9. Sep. 2009 (UTC)

The same problem in frwiki: all projects seem to have been updated today, but with the scan results from Monday. The new interface has not been accessible since last night... -- - Archimëa ⇔ 10:24, 9. Sep. 2009 (CEST)
There was a problem with the SQL server at the toolserver. This problem was fixed, but now the backup will be implemented. See this mail. - Regarding the problem with the error lists: this is a bigger problem. At the moment I am happy that the new interface is running very well and that users are using it. Also, at the moment only the errors from the live Wikipedia (and not from a dump) are in the database. Only new articles and recent changes are scanned and inserted into the database. This is also the reason for the low number of errors in the new interface: in dewiki maybe only 14000 or so, while in the dump the script finds over 100000 errors. In the next few days I will create a diagram of the processes so that everyone can understand the details. The biggest problem is that after a dump scan sometimes over 300000 articles must be scanned in the live Wikipedia, and this is too much. - For all users with a bot I have implemented an output list in the new interface, and also a function to set all articles as done. I hope this helps. -- sk 08:51, 9. Sep. 2009 (UTC)
Next problem: see this mail. -- sk 20:26, 9. Sep. 2009 (UTC)
Wonderful, it works. -- sk 11:48, 10. Sep. 2009 (UTC)
No, Stephan, it seems that it doesn't work for the French wiki: the files are dated "1 Sep 09 09:08", but the dump is already Monday's dump. 79.87.11.144 09:49, 12. Sep. 2009 (UTC)

WikiCleaner

Hi,

I have started working on Wiki Cleaner to add features in it for fixing the errors detected by your script. Version 0.93 is the first one with this. It's not yet functional and I still have a lot of work to do on it, but the basics are visible.

Main things that need to be done:

  • Allow editing and saving the contents of the articles
  • Highlight detected errors directly in the text of the articles and propose fixes
  • Add other errors (currently only errors 48 and 80 are recognized)
  • Read complete list of articles on the tool server

If people have comments about this tool, please use my talk page on FR.

--NicoV 12:13, 1. Sep. 2009 (UTC)

v0.94 is available: the page text is scanned and errors are highlighted directly in the text. Still not functional, since editing and saving are not done. --NicoV 20:16, 1. Sep. 2009 (UTC)

Hi Stefan, I have released v0.95 that allows editing and saving the articles, and also detects other errors (11 types currently). --NicoV 17:45, 4. Sep. 2009 (UTC)

Due to a hosting change, there's a new URL to install WikiCleaner (here). It's better to uninstall the old version before going for the new one (0.97). --NicoV 19:28, 14. Sep. 2009 (UTC)

Hidden auto-redirect

Hello Stephan,

Is it possible to detect a "hidden auto-redirect"? By "hidden auto-redirect" I mean a circular link through a redirect page. For example, in the French wiki we have the article fr:Intercalation (mesure du temps), which includes a link (at the end of the article) to fr:Mois intercalaire. That link returns the reader to the first article, fr:Intercalation (mesure du temps), because fr:Mois intercalaire is not an article but only a redirect to fr:Intercalation (mesure du temps).

This error looks like error 48, but via a redirect page.

I hope I have been clear; if not, please let me know. Regards, 79.87.11.144 10:08, 12. Sep. 2009 (UTC)

I know what you mean, but the script can't handle this problem. It only checks one article at a time and nothing more. -- sk 19:29, 14. Sep. 2009 (UTC)
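
To show why this check needs more than the text of a single article, here is a small Python sketch. It assumes that a mapping from redirect titles to their targets is available (for example, built from a dump beforehand); that mapping, the simplified link pattern, and the name self_links_via_redirect are assumptions for illustration and not part of the script.

```python
import re

# Simplified wikilink pattern: captures the target up to "|", "#" or "]]".
LINK_RE = re.compile(r"\[\[([^\[\]|#]+)")


def self_links_via_redirect(title, wikitext, redirects):
    """Return the linked titles that are redirects pointing back to `title`.

    `redirects` maps a redirect page title to the title of its target.
    """
    hits = []
    for match in LINK_RE.finditer(wikitext):
        target = match.group(1).strip()
        if redirects.get(target) == title:
            hits.append(target)
    return hits
```

With redirects = {"Mois intercalaire": "Intercalation (mesure du temps)"}, the link in the example article would be reported. Without such a table, a scanner that sees only one page at a time cannot know that the link target is a redirect.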

False positives in error #64

The Catalan wikipedia has lots of false #64 positives. See, for instance ca:Argolis or ca:Belau. They all seem to be redirects, and the repeated link is reported to be to the original page, but I can't find it. Can you look into it?

Also, I have a request for error #69 that you probably overlooked. We get the same false positive that the Italian wikipedia has reported for #69 in ca:Lector de codi de barres, and we will get it in ca:ISBN if it gets inspected. Can you please white-list them? --Joutbis 17:47, 15. Sep. 2009 (UTC)

Regarding the problem with #64: look at the redirect Belau. There you will find the problem. -- sk 19:42, 21. Sep. 2009 (UTC)
Regarding the problem with #69: I am working on a concept for a whitelist. I hope this will help in the future. -- sk 19:44, 21. Sep. 2009 (UTC)

New error for fixing references

What I have in mind looks like error 081. I often find references with the "name=" parameter where the reference is used only once, so the parameter is useless.

Example here

-- - Archimëa ⇔ 12:40, 18. Sep. 2009 (CEST)

Hmm, I will put this on the to-do list. -- sk 19:49, 21. Sep. 2009 (UTC)
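
The proposed check could be approximated as below. This is a rough Python sketch under stated assumptions: it recognizes only name="..." and unquoted name=... forms, ignores other attributes such as group=, and does not skip comments or nowiki sections. The function name is invented for the example.

```python
import re
from collections import Counter

# Matches <ref name="x"> and <ref name=x>, including self-closing tags.
REF_NAME_RE = re.compile(r'<ref\s+name\s*=\s*(?:"([^"]*)"|([^\s/>]+))', re.IGNORECASE)


def names_used_once(wikitext):
    """Return the sorted reference names that occur exactly once in the text."""
    counts = Counter()
    for match in REF_NAME_RE.finditer(wikitext):
        name = match.group(1) if match.group(1) is not None else match.group(2)
        counts[name] += 1
    return sorted(name for name, count in counts.items() if count == 1)
```

A name that is defined once and then reused through a self-closing tag is counted twice and is therefore not reported.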

New interface - Graphic bug

Error 031, certainly due to HTML entities... -- - Archimëa ⇔ 23:13, 19. Sep. 2009 (CEST)

I know this problem. The problem is the two ways of display: wikisyntax and HTML. Maybe I will drop the wikisyntax and only support the web interface with HTML. -- sk 19:56, 21. Sep. 2009 (UTC)

DEFAULTSORT (037)

Hello Stefan, the Italian Wikipedia has the template Bio for biographies. It has a parameter ForzaOrdinamento ("sort order") that works like DEFAULTSORT and forces a particular sort key. Example: ForzaOrdinamento = Cajkovskij, Petr Ilic in the article it:Pëtr Il'ič Čajkovskij. Would it perhaps be possible to take this parameter into account? In the simplest case, articles using Bio could be excluded from the check. Ideally, your script would look not only for DEFAULTSORT but also for the parameter ForzaOrdinamento. Just a suggestion :) --MaEr 18:27, 17. Sep. 2009 (UTC)

I am very busy at the moment. I have to push this back for now. I will put it on the to-do list. -- sk 19:45, 21. Sep. 2009 (UTC)
You can add template FD from dawiki - same problem. --Steenth 22:02, 26. Sep. 2009 (UTC)
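
The "ideal case" from the request, looking for either DEFAULTSORT or the ForzaOrdinamento parameter, might be sketched as follows. This is a Python illustration with assumed regular expressions; it does not verify that the parameter actually sits inside the Bio template, and the corresponding parameter of the dawiki template FD is not named in the thread, so it is not covered.

```python
import re

# Assumption: a sort key comes either from {{DEFAULTSORT:...}} or from a
# template parameter named ForzaOrdinamento (as described for itwiki's Bio).
SORTKEY_PATTERNS = [
    re.compile(r"\{\{\s*DEFAULTSORT\s*:\s*([^}]+?)\s*\}\}"),
    re.compile(r"\|\s*ForzaOrdinamento\s*=\s*([^|}\n]+)"),
]


def find_sortkey(wikitext):
    """Return the first sort key found, or None if the page declares none."""
    for pattern in SORTKEY_PATTERNS:
        match = pattern.search(wikitext)
        if match:
            return match.group(1).strip()
    return None
```

A check for a missing sort key would then report a page only when find_sortkey returns None.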

New interface

Here is the new interface. The basic functionality is implemented, but I think I can add much more in the near future. If you have ideas for new features, tell me here. Next I will implement a whitelist and better updating of the data. -- sk

The links to Japanese Wikipedia and its translation page are broken. Possibly a character encoding problem? --fryed-peach 08:48, 1. Sep. 2009 (UTC)
Yes, I have seen this too, also in other languages (ru, ar). I will fix this. -- sk 09:05, 1. Sep. 2009 (UTC)
Hi, I did some testing; it appears to be handy. No request, only what I think ;) ... It's clean and "squared"... Page loading times are good. Colors... (there are all kinds of tastes in the world! perhaps they will be tweakable...)... I don't know how you planned it... an include on each project page would be possible... The done button is awesome! The possibility of a big output number (← 100 bis 125 → for example) is really useful -- Cordialement - Archimëa ⇔ 1 septembre 2009 à 14:06 (CEST)
Hi. Just to be sure: will links like this one still be available? It's just so that tools are able to read the list of errors. --NicoV 12:23, 1. Sep. 2009 (UTC)
Maybe in the future I will implement this inside the script, so only the link will change. But the page will stay available, maybe under "&view=bot" or so. Is this OK for you? -- sk 12:28, 1. Sep. 2009 (UTC)
Yes, "&view=bot" should be OK. The idea is just to have a simple list (minimal formatting to allow simple parsing, ideally only a text file with one title per line) with all articles where a specific error has been detected. --NicoV 13:29, 1. Sep. 2009 (UTC)
Hello Rjclaudio, to 1) this is a good idea, but sometimes we have vandals. I will check this, but later. To 2) Yes, this will be possible. This is also my next idea. I will try it. First I must fix some basic problems in the database. To 3) Why is this useful? I think it is OK, but I can also exclude it. -- sk 20:13, 2. Sep. 2009 (UTC)
Could this "done button" at least delete page errors (25 entries) instead of the whole error list? -- - Archimëa ⇔ 13:29, 3. Sep. 2009 (CEST)
Navigation problem, example: when I'm fixing problems in an error list (e.g. "Square brackets not correct begin") and I choose "more" for an article, I go to the "article page error" (as I will call it), and then it's hard and time-consuming to go back to the error list I came from (in my example "Square brackets not correct begin")... -- - Archimëa ⇔ 13:38, 3. Sep. 2009 (CEST)
Hello Archimëa, I hope I have fixed this "navigation problem" for you. The line is no longer deleted; only the "done" switches to "ok". So you can go back to the first page. -- sk 06:22, 4. Sep. 2009 (UTC)

I suggest more ways to group the errors. Some projects use "BOT" and "AWB" in the name. If this interface could group in the same table all errors that AWB can fix, it would help a lot and would be a good advantage over the old version. And maybe not only AWB/BOT; the options could be customized in many ways by each project independently. Rjclaudio 11:00, 4. Sep. 2009 (UTC)

Yes, I saw it last night; it's far better and it resolves the problem. Nice -- - Archimëa ⇔ 13:21, 4. Sep. 2009 (CEST)

Hi Stefan, a question about the "Done" button. Does it mark the problem as solved (until the next run of Check Wiki?) so that people fixing errors can work more efficiently? I am still working on WikiCleaner to provide an interface for fixing the errors (hopefully, a functional version before the end of the weekend). Is there a way for my tool to easily simulate the click on the "Done" button? --NicoV 14:02, 4. Sep. 2009 (UTC)

@Rjclaudio: If I understand you right, you want the info AWB/BOT or so for every error number. This is possible, but I need a list from AWB and bots, maybe for every language? -- sk 14:38, 4. Sep. 2009 (UTC)
@NicoV: See the Done link. You only need to request http://toolserver.org/~sk/cgi-bin/checkwiki/checkwiki.cgi?project=dewiki&view=only&id=30&pageid=4534003 if you fix error 30 for page 4534003 in de. The script only makes an update in the database, ok=0 → ok=1, and nothing more. At the moment, the next scan rescans all pages with ok=1. -- sk 14:38, 4. Sep. 2009 (UTC)
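
For tool authors, building the Done link from the example above might look like this in Python. The base URL and the parameter names (project, view, id, pageid) are taken from the link quoted in the reply; the function name done_url is an assumption, and the sketch only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

# Base address as quoted in the reply above.
BASE_URL = "http://toolserver.org/~sk/cgi-bin/checkwiki/checkwiki.cgi"


def done_url(project, error_id, page_id):
    """Build the URL that marks one error on one page as done."""
    query = urlencode({
        "project": project,
        "view": "only",
        "id": error_id,
        "pageid": page_id,
    })
    return f"{BASE_URL}?{query}"
```

Calling done_url("dewiki", 30, 4534003) reproduces the link quoted above. A tool would then fetch that URL after saving its fix.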

Just like Wikipedia:Projetos/Check Wikipedia/Tradução, there could be another page to associate error <-> bot/awb/manual/semi-bot/etc. Or something like "error_091_desc_script=" could be used, but as "error_091_clas_script=2" (clas = classification). And just like

#########################
# error description
#########################
# prio = -1 (unknown)
# prio = 0  (deactivated) 
# prio = 1  (top priority)
# prio = 2  (middle priority)
# prio = 3  (lowest priority)

do a

#########################
# clas description
#########################
# clas = 0  (manual)
# clas = 1  (awb)
# clas = 2  (bot)

but with unlimited clas (or max 10). Each language would use 3 (manual, awb, bot) or 5 (manual, partial awb, awb, partial bot, bot), or 20.

And it could help integrate the projects, working together to create rules for bot/awb to fix similar errors. In pt.wiki we have 52 errors that use bot/awb, and maybe other languages have rules for the others. This would make it easier to find help in other languages.

Rjclaudio 15:28, 4. Sep. 2009 (UTC)
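
Reading such classification settings from a translation page could be sketched as follows. This is a Python illustration under assumptions: the key form error_NNN_clas_script comes from the proposal above, the analogous error_NNN_prio_script key is assumed here for symmetry, and the three labels are the ones listed in the proposal. The function name is invented.

```python
import re

# Labels from the proposal above; projects could define more classes.
CLAS_LABELS = {0: "manual", 1: "awb", 2: "bot"}

# Assumption: settings look like "error_091_clas_script=2" (or "..._prio_script=3").
SETTING_RE = re.compile(r"^\s*error_(\d{3})_(prio|clas)_script\s*=\s*(-?\d+)\s*$")


def parse_settings(text):
    """Return {error_number: {"prio": n, "clas": n}} for the settings found."""
    settings = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # comment lines such as "# clas = 2  (bot)"
        match = SETTING_RE.match(line)
        if match:
            number, kind, value = match.groups()
            settings.setdefault(int(number), {})[kind] = int(value)
    return settings
```

The interface could then group errors by CLAS_LABELS[settings[n]["clas"]] to build one table per class.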

Maybe by showing AWB rules to fix some of the errors. In pt.wiki we did this for some errors, but something universal (that each project would adjust, like changing "Image" to "Imagem" in the rule) would be better. Rjclaudio 15:32, 4. Sep. 2009 (UTC)
When you are on a page, for example at "75 to 100", and you click the done button, you go back to the previous page of 25 entries... -- - Archimëa ⇔ 01:05, 6. Sep. 2009 (CEST)
 Ok, I have fixed this bug. -- sk 11:51, 6. Sep. 2009 (UTC)

Hello, I love the new interface, but I am also begging for an "all done" button. :) --Ragimiri 11:37, 7. Sep. 2009 (UTC)

Ok, I will try to implement this. :-) But with many questions like "Are you sure?" -- sk 13:22, 7. Sep. 2009 (UTC)
 Ok, I have implemented this function. -- sk 20:03, 7. Sep. 2009 (UTC)
The description of error 063 contains a "small" tag. It creates a graphical bug... -- - Archimëa ⇔ 20:50, 7. Sep. 2009 (CEST)
At the moment all descriptions are bad, because they include wikisyntax and no HTML. I will fix this with a translation page. -- sk 20:03, 7. Sep. 2009 (UTC)
If I'm right, errors are sorted into high/medium/low based on the script level and not on the wiki project level (maybe this will come with the inclusion of the translation, because the levels are set there?).
An undefined table width is less useful with some errors. When the table is wider than the screen it's a pity... to see how it looks, see this error -- - Archimëa ⇔ 11:12, 10. Sep. 2009 (CEST)
Yes, at the moment only the script level is used. I don't see the problem with the table. I think flexible is OK. Please use new headlines for new requests. I don't like such long discussions. :-) -- sk 11:48, 10. Sep. 2009 (UTC)
This section can be archived. sk 21:03, 4. Okt. 2009 (CEST)

Page moved (eo.wiki)

The Esperanto project page was moved from eo:Vikipedio:WikiProjekt Check Wikipedia to eo:Projekto:Check Wikipedia, because of the creation of the namespace Projekto. Should I do something to correct the interwikis? The translation page was also moved from eo:Vikipedio:WikiProjekt Check Wikipedia/Translation to eo:Projekto:Check Wikipedia/Tradukado. Thanks. Castelobranco 01:29, 7. Sep. 2009 (UTC)

Hello Castelobranco, thanks for this info. I will fix this in the script. Then, with the next scan on Wednesday, all pages will have the right interwiki link. -- sk 06:55, 7. Sep. 2009 (UTC)
 Ok, I have changed this in the script. -- sk 18:31, 7. Sep. 2009 (UTC)
This section can be archived. sk 21:08, 4. Okt. 2009 (CEST)

Error Code 047:

Hello Stefan Kühn,

In my opinion these are not wrong: [3] (Ordinaalgetal) is not a template, but has something to do with mathematics. Regards. --Algont 20:44, 7. Sep. 2009 (UTC)

Fixed. In that case simply put <nowiki> tags around it (or <math> tags, if they fit). --xAwOc 20:58, 7. Sep. 2009 (UTC)
<math></math> would be better. -- sk 06:36, 8. Sep. 2009 (UTC)
This section can be archived. sk 21:08, 4. Okt. 2009 (CEST)

Table max width (new interface)

When the table is wider than the screen (my screen is only 19') it's not useful. Not all "Done" buttons are displayed, so you must use the horizontal scroll bar... If you have to do this 10, 15 or 20 times ("done" -> then H-bar, "done" -> then H-bar, "done" -> then H-bar, etc...) :-(

Whether you see the problem may depend on your screen width. Example (hoping it's wide enough on your screen), but looking at this, it seems to already have a maximum width, no?

Is there no way for it to be based on the OS screen resolution, for example? (I don't know if it's easy to code...!) -- - Archimëa ⇔ 16:31, 10. Sep. 2009 (CEST)

Hello Archimëa, the problem is mostly a single article with a long unbreakable notice. For example "{{Löschantragstext|tag=4|monat=September|jahr=2009|titel=Fachverband…". Once you have done this one, you have a smaller table. -- sk 18:52, 10. Sep. 2009 (UTC)
Yes, with "a long unbreakable notice"... Indeed, when there is one, it's not a problem, only when there are many... OK... it was only a suggestion... -- - Archimëa ⇔ 22:01, 10. Sep. 2009 (CEST)
This section can be archived. sk 21:08, 4. Okt. 2009 (CEST)

1 week without update for frwiki

Hello
It has been one week since frwiki_output_for_wikipedia.html was last updated... All users have drifted away from the project! The last update was made last Monday. Can you do something? -- - Archimëa ⇔ 10:20, 14. Sep. 2009 (CEST)

I will check this tonight. -- sk 17:25, 14. Sep. 2009 (CEST)
BTW, we tried to activate a detection on Sep 9; perhaps that's the reason for the error... We will deactivate it for the next check... Al1 17:49, 14. Sep. 2009 (CEST)
I have checked my script and didn't find an error. Maybe this activation was the problem. I will check this again. Thanks for this tip. -- sk 21:31, 14. Sep. 2009 (CEST)
Can we hope for a (simple) scan for fr tonight? -- - Archimëa ⇔ 22:13, 14. Sep. 2009 (CEST)
A suggestion (maybe a stupid one): perhaps you should delete or rename the frwiki directory, then create a new empty one? In case it's a disk error due to last Wednesday's crash... Al1 06:57, 15. Sep. 2009 (CEST)
Yesterday I did not have enough time to check everything. The script runs correctly. You can see on this page when the last update was. But I don't understand why this file is not updated after the run. The mystery is that all other languages are OK. I will check this again tonight. -- sk 13:39, 15. Sep. 2009 (CEST)
I will check this again. -- sk 20:31, 19. Sep. 2009 (CEST)
I added == Code == on the translation page, right before <pre>...translation...</pre>. Can this be the problem? -- - Archimëa ⇔ 13:45, 15. Sep. 2009 (CEST)
I have found the problem. It is inside fr:Orthose. The text "Gordon\'s Mineralogy of Pennsylvania (1922) p. 191" contains a "\'". This causes a problem in the script. When the script inserts something into the database it must escape a ' as \', but here this escape sequence is already in the text. The script stops at this point and doesn't copy the new pages for frwiki. I will fix this. -- sk 20:54, 15. Sep. 2009 (CEST)
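A minimal sketch of one way to sidestep this class of bug, not the fix that was actually applied: if quotes are escaped by hand, a text that already contains \' presumably ends up as \\' (an escaped backslash followed by a bare quote), which ends the SQL string early. Passing the text as a bound parameter leaves the quoting to the database driver. The example uses Python's built-in sqlite3 module purely for illustration; the table `notices` and the function `store_notice` are hypothetical names. Perl's DBI offers the same idea through `?` placeholders.

```python
import sqlite3


def store_notice(conn, title, notice):
    # The "?" placeholders are filled in by the driver, so quotes and
    # backslashes in the text never become part of the SQL statement itself.
    conn.execute(
        "INSERT INTO notices (title, notice) VALUES (?, ?)",
        (title, notice),
    )


# Hypothetical in-memory table, only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notices (title TEXT, notice TEXT)")

# The text from fr:Orthose that already contains a backslash-quote pair.
tricky = r"Gordon\'s Mineralogy of Pennsylvania (1922) p. 191"
store_notice(conn, "Orthose", tricky)

stored = conn.execute(
    "SELECT notice FROM notices WHERE title = ?", ("Orthose",)
).fetchone()[0]
```

With bound parameters the text is stored unchanged, backslash included, and no manual escaping step is needed.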
Great! But what a pity... for only a typo... The good thing is that once it's fixed, it will never happen again... I also fixed the title of the book ;) ... -- - Archimëa ⇔ 21:28, 15. Sep. 2009 (CEST)
Hello! The same problem exists in ruwiki. ruwiki_output_for_wikipedia.html has not been updated since 02.09.2009. Can you also fix this? --SPKirsch 21:51, 15. Sep. 2009 (CEST)
I think he is going to handle this \' problem directly in the script for all languages... -- - Archimëa ⇔ 21:54, 15. Sep. 2009 (CEST)
The update has been done for frwiki, thanks... it appears ruwiki is not updated... -- - Archimëa ⇔ 13:57, 16. Sep. 2009 (CEST)
The scan has run through again for ruwiki, as can also be seen in the interface. But this ruwiki_output_for_wikipedia.html is simply not being updated. What is going wrong there? --SPKirsch 22:49, 17. Sep. 2009 (CEST)
Hello SPKirsch, I'm working on it, but I haven't been able to eliminate the error yet. I hope I have more time this week to fix the problem. -- sk 09:47, 21. Sep. 2009 (CEST)
So, I have changed something now and hope that it runs through. Let's see. -- sk 21:40, 21. Sep. 2009 (CEST)
Damn, it still doesn't work. It must be something bigger. The problem is that I can hardly read the Russian titles on the command line. I'll have to keep looking. -- sk 22:14, 21. Sep. 2009 (CEST)
Same for uk: no update since 2009-09-29 21:50 --xAwOc 23:28, 2. Okt. 2009 (CEST)

 Ok, I have fixed the problem. It was a difficult problem, hard to catch. I will describe it: The script uses the API. To make the script faster, I check more than one title at a time. I use a limit of 25 titles because the URL for the API can't be longer than a maximum length. This works very well. But when the script scans a language with non-Latin letters like Cyrillic (ruwiki), it has a problem. The letters must be transformed ("Военно-воздушные силы и войска ПВО Узбекистана" into "%D0%92%D0%BE%D0%B5%D0%BD%D0%BD%D0%BE-%D0%B2%D0%BE%D0%B7%D0%B4%D1%83%D1%88%D0%BD%D1%8B%D0%B5%20%D1%81%D0%B8%D0%BB%D1%8B%20%D0%B8%20%D0%B2%D0%BE%D0%B9%D1%81%D0%BA%D0%B0%20%D0%9F%D0%92%D0%9E%20%D0%A3%D0%B7%D0%B1%D0%B5%D0%BA%D0%B8%D1%81%D1%82%D0%B0%D0%BD%D0%B0") so that the API works with this link. Every byte is transformed into 3 characters, so every Cyrillic letter (2 bytes in UTF-8) becomes 6 characters. In most cases this is no problem, but sometimes, if the 25 titles are very, very long, the URL gets too long. For example see this API-Request. I have fixed this problem in the script and hope it works. -- sk 21:09, 3. Okt. 2009 (CEST)
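The idea described above can be sketched as follows: instead of always sending 25 titles, close a batch as soon as either the title count or the percent-encoded length would exceed a budget. This is only an illustration in Python, not the code of the actual script; `batch_titles` is a hypothetical helper, and the 2000-character budget is an assumed value, not a documented API limit.

```python
from urllib.parse import quote

MAX_TITLES = 25         # batch size mentioned in the thread
MAX_QUERY_CHARS = 2000  # assumed budget for the encoded titles, not a documented limit


def batch_titles(titles, max_titles=MAX_TITLES, max_chars=MAX_QUERY_CHARS):
    """Yield lists of titles whose joined, percent-encoded form fits the budget."""
    batch, length = [], 0
    for title in titles:
        # safe="" makes quote() encode every reserved character, so the
        # measured length is the worst case for this title.
        cost = len(quote(title, safe=""))
        # Titles are joined with "|", which costs 3 characters ("%7C") encoded.
        extra = cost + (3 if batch else 0)
        if batch and (len(batch) >= max_titles or length + extra > max_chars):
            yield batch
            batch, length = [], 0
            extra = cost
        batch.append(title)
        length += extra
    if batch:
        # A single over-long title is still sent alone rather than dropped.
        yield batch
```

For Latin titles this behaves like the fixed limit of 25; for long Cyrillic titles the batches simply get smaller, so the request URL stays within the budget.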

Oops, there is a new problem. I will fix this tomorrow. -- sk 22:40, 3. Okt. 2009 (CEST)
 Ok, now it runs! I have updated the page in ruwiki. Now I will start the scan of ukwiki. -- sk 10:59, 4. Okt. 2009 (CEST)
Danke!!! Thank you!!! Well done.--SPKirsch 14:20, 4. Okt. 2009 (CEST)

Error #003 on italian wikipedia

Hi! On it.wiki, Template:R is a redirect to {{References}}. Some pages that contain {{r}} are listed as errors. Can you add that template to your script? Thanks! --Beta16 15:30, 21. Sep. 2009 (CEST)

I have added this. -- sk 22:04, 21. Sep. 2009 (CEST)
Very fast. Thanks! :) --Beta16 10:42, 22. Sep. 2009 (CEST)
This section can be archived. sk 21:10, 4. Okt. 2009 (CEST)

commons

On Commons, user:Rocket000 deleted the sentence „There has to be a space in between "br" and the slash.“ There is no translation page on Commons. In the new interface, the page commons:Wikipedia:WikiProject Check Wikipedia/Translation is given as the translation page; it is an interwiki link to en:WikiProject Check Wikipedia/Translation, which doesn't exist. It should be commons:Commons:WikiProject Check Wikipedia/Translation, but that doesn't exist either. --xAwOc 11:35, 25. Sep. 2009 (CEST)

At the moment I have a problem with Commons. I hope I can fix this at the weekend. -- sk 17:03, 25. Sep. 2009 (CEST)
Speaking of Commons, it would be really great to have a Commons listing in the new interface (always including files as though they were mainspace pages, maybe, although I know that might be too many pages for the script to handle). -Drilnoth (Talk) 20:49, 2. Okt. 2009 (CEST)
 Ok, I have added Commons to the new interface and updated the translation page. -- sk 11:57, 4. Okt. 2009 (CEST)
This section can be archived. sk 21:10, 4. Okt. 2009 (CEST)

error 003 on hewiki

This error (article has a <ref> but no <references />) is reported 2210 times because we often use a template instead of <references />. The template is "הערות שוליים" and it can appear as (read from right to left):

{{הערות שוליים}}

-or-

{{הערות שוליים|anything here}}

Can you please fix the script for this? Thanks, Mikimik 22:40, 26. Sep. 2009 (CEST)

Also, we use a template, "הערה", instead of <ref>. Thanks again, Mikimik 23:33, 27. Sep. 2009 (CEST)
 Ok, I have added these two references. -- sk 14:28, 4. Okt. 2009 (CEST)
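As a rough sketch of what such a per-wiki configuration could look like (an illustration in Python, not the logic of the actual script): the check treats a configured template as equivalent to <ref> or to <references />, both in the plain form and in the form with parameters. The function names and the two parameters `list_templates` and `cite_templates` are hypothetical. The sketch ignores letter case entirely, which is looser than MediaWiki's rule but enough to accept {{r}} for Template:R.

```python
import re

REF_TAG = re.compile(r"<ref[\s>]", re.IGNORECASE)
REFERENCES_TAG = re.compile(r"<references\b", re.IGNORECASE)


def uses_template(text, names):
    """True if the text calls one of the templates as {{Name}} or {{Name|...}}."""
    if not names:
        return False
    alternatives = "|".join(re.escape(name) for name in names)
    pattern = r"\{\{\s*(?:" + alternatives + r")\s*(?:\||\}\})"
    return re.search(pattern, text, re.IGNORECASE) is not None


def has_error_003(text, list_templates=(), cite_templates=()):
    """Report a page that has footnotes but nothing that displays them."""
    has_ref = bool(REF_TAG.search(text)) or uses_template(text, cite_templates)
    has_list = bool(REFERENCES_TAG.search(text)) or uses_template(text, list_templates)
    return has_ref and not has_list
```

For hewiki the call would pass `list_templates=("הערות שוליים",)` and `cite_templates=("הערה",)`; for it.wiki, `list_templates=("References", "R")`.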

Dump Scan

Can we have a dump scan for frwiki? (A new dump was finished 2 weeks ago.)

The errors we usually fix are always at around 0 to 10 per scan (75% of the 91 errors). All other errors number in the thousands, and we're more or less working on them... -- - Archimëa ⇔ 17:34, 30. Sep. 2009 (CEST)

We have some errors that are not detected: take a look at Liste d'articles non-détectés; perhaps these articles will be detected with the new dump? -- - Archimëa ⇔ 16:56, 3. Okt. 2009 (CEST)
Bump. Is it possible? Please? -- - Archimëa ⇔ 16:48, 6. Okt. 2009 (CEST)
 Ok -- sk 21:55, 7. Okt. 2009 (CEST)
Thx for the dump -- - Archimëa ⇔ 13:29, 8. Okt. 2009 (CEST)

New suggestion: mixed cyrillic/latin letters in a word

Hi Stefan,

Could you extend your script to search for words that contain Cyrillic as well as Latin characters? For example, sometimes Latin A (U+0041) is accidentally replaced by Cyrillic А (U+0410). A simple regular expression could probably recognize this case. Something like this:

$text =~ /[\x{0400}-\x{04f9}][A-Za-z\x{00c0}-\x{00ff}\x{0100}-\x{0233}]/;
$text =~ /[A-Za-z\x{00c0}-\x{00ff}\x{0100}-\x{0233}][\x{0400}-\x{04f9}]/;

-- Bitman 07:04, 30. Sep. 2009 (CEST)

I have never seen a regexp like this. Does this work? Very interesting. At the moment I am working on the translation of the new interface. After this I will try your idea. -- sk 21:18, 5. Okt. 2009 (CEST)

It is like /[a-z][A-Z]/ but uses Unicode chars. This is a code snippet of my bot that repairs error 16: hu:User:GumiBot/code16. Yes, it works. :-) --Bitman 15:36, 9. Okt. 2009 (CEST)
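For readers who want to try the idea outside Perl, here is a small Python sketch using the same character ranges as the two expressions above. It is only an illustration, not code from GumiBot or from the check script; `find_mixed_words` is a hypothetical helper that returns every word in which a Cyrillic and a Latin letter stand next to each other.

```python
import re

# Same ranges as in the Perl expressions quoted in this thread.
CYRILLIC = r"\u0400-\u04f9"
LATIN = r"A-Za-z\u00c0-\u00ff\u0100-\u0233"

# A Cyrillic letter directly followed by a Latin one, or the other way round.
MIXED = re.compile(f"[{CYRILLIC}][{LATIN}]|[{LATIN}][{CYRILLIC}]")


def find_mixed_words(text):
    """Return the words that contain adjacent Cyrillic and Latin letters."""
    return [word for word in re.findall(r"\w+", text) if MIXED.search(word)]
```

Because the search runs word by word, a Latin word that merely stands next to a Cyrillic word (separated by a space or punctuation) is not reported.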