Benutzer Diskussion:Stefan Kühn/Check Wikipedia/Archiv/2009/Juni


Feature request: instant feedback

Could you have a link next to the results of each check which triggers that check to run again and updates the list of errors? It would only have to re-check the first 50 errors, as those are the ones shown. This would stop people checking errors which had already been fixed, or making changes which don't actually fix the errors.

Sorry, I don't understand. Do you mean a link that starts the script to check the next 50 articles? Please describe it in more detail. -- sk 15:49, 1 June 2009 (UTC)
Sorry, let me give a clearer example. Consider the section http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Check_Wikipedia#Double_pipe_in_one_link which has a list of 50 articles out of 860. Now imagine there was a link / button at the start of this section, and when you clicked it the script would run the test for "Double pipe in one link" against those 50 results, and remove from the list any that had been fixed. The list would stay at 50 results, though, by being repopulated from the remaining 810 (which are assumed to be still broken).
Hello IP, OK, I understand this feature request. At the moment I cannot include this in my script. It is very complex and I am happy that it works. But I am thinking about the next generation of this script, and this feature would be a good one. But then I cannot include this in Wikipedia, because I have no idea how to do that. I could write a special page in Perl and include it there, but this needs time. A lot of time! :-) -- sk 07:44, 2 June 2009 (UTC)
In the next generation of the script you could have the data stored in a database (if you don't already) and when someone asks for one type of error to be retested, the script can pull out the top 50 results for that error from a database table, retest each one and delete any which are now fixed. Then you can have a function in the script which takes the top 50 results for each error and generates the updated wiki page for it, replacing the old one. I don't actually know how your script works, but it is impressive. Maybe you could explain what is hard about this feature. If you can't use a database, then you could have a separate page for each error which stores the data in tab-delimited format, for example.
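The database idea described above could be sketched roughly as follows. This is a minimal Python illustration, not the actual Check Wikipedia script (which is written in Perl and whose internals are not shown in this thread). The `hits` table, the `retest_top` function and the `still_broken` callback are all hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical schema: one row per (error number, article title) hit.
SCHEMA = "CREATE TABLE IF NOT EXISTS hits (error_id INTEGER, title TEXT)"


def retest_top(conn, error_id, still_broken, limit=50):
    """Re-check the first `limit` stored hits for one error.

    `still_broken(title)` is a caller-supplied function that re-runs the
    check for a single article and returns True if the error is still there.
    Returns the refreshed list of up to `limit` titles to display.
    """
    query = "SELECT title FROM hits WHERE error_id = ? ORDER BY title LIMIT ?"
    shown = [title for (title,) in conn.execute(query, (error_id, limit))]
    for title in shown:
        if not still_broken(title):
            conn.execute(
                "DELETE FROM hits WHERE error_id = ? AND title = ?",
                (error_id, title),
            )
    conn.commit()
    # Fixed articles are gone, so the list refills from the remaining hits.
    return [title for (title,) in conn.execute(query, (error_id, limit))]
```

The returned titles would then be used to regenerate the wiki section for that error; how the page itself is written back is left out of the sketch.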

Check 75

Please see [1]. In that case it would probably be better to use a simple bullet point. --Matthäus Wander 00:54, 2 June 2009 (CEST)

I have removed the pointless indentation. The headings are enough to tell the items apart. -- sk 07:40, 2 June 2009 (UTC)

Check #71 - X at wrong position for ISBN

I'm trying to figure out whether there are some false positives for this check on the Chinese Wikipedia. The description of the script says that it checks that X is at position 10 only. Some of the results for the Chinese Wikipedia are ISBNs of length 13. Is the X supposed to be at position 10 or 13 for them? For example, zh:旋風管家 contains the ISBN 978-4-09-127272-X. --Vina 06:44, 3 June 2009 (UTC)

An "X" is allowed only in an ISBN-10. If you find an "X" in an ISBN-13, it is an error. See ISBN. If the check digit of an ISBN-10 works out to 10, you write "X". If the check digit of an ISBN-13 works out to 10, you write "0". So an "X" in an ISBN-13 is wrong. Sometimes the publisher prints the wrong number on the book. In en/de/sv we have a template for these wrong ISBNs. See en:Template:Listed Invalid ISBN. -- sk 06:56, 3 June 2009 (UTC)
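The check-digit rules mentioned in this answer can be written out as a short Python sketch. It uses the standard ISBN-10 (weights 10 down to 2, modulo 11) and ISBN-13 (alternating weights 1 and 3, modulo 10) formulas. The function names are made up for illustration, and this is not the code of the actual script.

```python
def isbn10_check_digit(first9: str) -> str:
    """Check digit for the first nine digits of an ISBN-10 ('X' means 10)."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def isbn13_check_digit(first12: str) -> str:
    """Check digit for the first twelve digits of an ISBN-13 (never 'X')."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def x_is_misplaced(isbn: str) -> bool:
    """True if the ISBN contains an X anywhere except as the 10th
    character of a 10-character ISBN."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if "X" not in chars:
        return False
    return len(chars) != 10 or chars.index("X") != 9
```

For the number from the question, `x_is_misplaced("978-4-09-127272-X")` is true, and `isbn13_check_digit("978409127272")` gives `"0"`. The ISBN-10 form of the same book does end in X.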

error ? in check #34

Why does this check not detect [2] and [3]? They contained | class="float{{{1}}}" width="{{{width}}}" align="{{{1}}}" style="background-color:inherit;border-collapse:collapse;border-style:none;margin: .5em .75em;", some code from pl:Template:CytatD with {{{. Malarz pl 09:39, 5 June 2009 (UTC)

sub error_034_template_programming_elements checks @lines, which contains no table code. In those articles the template elements (parameters) were in a table. Malarz pl 09:50, 5 June 2009 (UTC)
At the moment I exclude tables from the check. -- sk 10:06, 5 June 2009 (UTC)
Why? Malarz pl 06:14, 6 June 2009 (UTC)
When I wrote this script, I had many problems with tables. Some of them I have never solved, for example a table inside a table. It is very tricky to check such a table. So at first I took the quick and dirty way: I excluded the tables. :-) Maybe in the future I will have to change this, but it needs a little time. -- sk 19:30, 6 June 2009 (UTC)

wrong <pre> exclusion

I found that your script probably checks code inside <pre style="height:20em; overflow-y:scroll">. It works fine when the <pre> tag is used without attributes, but not in this case. The problem shows up in pl:dmesg for check #56, and probably in a few more articles. Malarz pl 20:48, 3 June 2009 (UTC)

I will check this. -- sk 10:16, 4 June 2009 (UTC)
OK, I have changed the code. If you see it again, please report it here. Thanks. -- sk 11:38, 7 June 2009 (UTC)
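The thread does not show how the exclusion was changed. One way to make a <pre> exclusion tolerate attributes is a pattern that allows optional attribute text after the tag name. The following Python sketch is an assumption for illustration only; `PRE_BLOCK` and `strip_pre_blocks` are hypothetical names.

```python
import re

# "<pre", then either ">" directly or whitespace plus attributes, then the
# shortest possible body up to the closing tag. Requiring whitespace or ">"
# right after "pre" keeps other tags that merely start with "pre" untouched.
PRE_BLOCK = re.compile(
    r"<pre(?:\s[^>]*)?>.*?</pre\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_pre_blocks(text: str) -> str:
    """Remove <pre>...</pre> blocks, with or without attributes."""
    return PRE_BLOCK.sub("", text)
```

Nested or unclosed <pre> blocks are not handled by this sketch.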

DEFAULTSORT parameter starting with a white space

Hello Mr. Kühn,

Sometimes I find a DEFAULTSORT starting with white space:

{{DEFAULTSORT:             Doe, John}}

This is a mistake.

Keep up the good work.

Regards,

Cantons-de-l'Est

Very interesting idea. I will try to add this. Thanks. -- sk 20:28, 4 June 2009 (UTC)
OK, I have added the new error 88. -- sk 19:46, 7 June 2009 (UTC)
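A check like the one requested here could look roughly like this in Python. It is only a sketch: how error 88 is actually implemented is not shown in the thread, the names are hypothetical, and localized aliases of DEFAULTSORT are not handled.

```python
import re

# "{{DEFAULTSORT:" followed by at least one space or tab before the key.
LEADING_SPACE_SORTKEY = re.compile(r"\{\{DEFAULTSORT:[ \t]+")


def has_leading_space_sortkey(text: str) -> bool:
    """True if a DEFAULTSORT key in `text` starts with white space."""
    return LEADING_SPACE_SORTKEY.search(text) is not None
```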

check 016

Hello, I just found a problem with check 016 when I tried to delete the control character #x200B in the article cs:Czechowice-Dziedzice. It seems to me that this character is somehow connected to some IPA characters, like "͡" in this example, and when I try to delete the control character I also delete the IPA character. Is there a solution on my side, or is an exception in the script needed? --Reaperman (cs) 12:51, 4 June 2009 (UTC)

I read about the same problem somewhere yesterday, but now I can't find it! I will check this at the weekend. I think I must exclude this character. -- sk 20:32, 4 June 2009 (UTC)
Ah yes, here is where I saw it yesterday. -- sk 05:55, 5 June 2009 (UTC)
OK, I have changed the code for error 16. -- sk 11:41, 7 June 2009 (UTC)

<noinclude> and others

Maybe <noinclude> in article space is not an error, but <noinclude></noinclude> or <noinclude>\n</noinclude> (with a newline) is. The same goes for the tags <includeonly> and <onlyinclude>. Examples: [4], [5] Malarz pl 09:34, 5 June 2009 (UTC)

Interesting idea. I will try this. -- sk 10:06, 5 June 2009 (UTC)
OK, new error 85. -- sk 11:55, 7 June 2009 (UTC)

Misformatted external links

I can't remember if I already asked for this one or not; if so, my apologies... it must have been archived. Anyway, I was wondering if the script could detect external links which have double brackets, rather than single brackets, around them? This causes display errors like [this]. It would also be great if it could search for external links which contain a pipe | symbol, since this is often a sign of someone trying to separate the link's target and its description in the same way as with an internal link. Thank you! Keep up the great work! -Drilnoth (Talk) 16:57, 6 June 2009 (UTC)

OK, new error 86. I don't check for the pipe symbol, because I found a correct web link with a pipe, something like http://www.xyz.abc?test=asd|asasd&sdfsdf. --sk 12:08, 7 June 2009 (UTC)
Good point; thanks! -Drilnoth (Talk) 15:56, 7 June 2009 (UTC)
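For illustration, a double-bracket check of the kind discussed here might be sketched in Python as below. The pattern and names are assumptions, not the script's actual error 86 code. In line with the reply above, pipes inside URLs are deliberately not treated as errors.

```python
import re

# Two opening brackets directly followed by a URL scheme, up to "]]".
DOUBLE_BRACKET_EXTERNAL = re.compile(
    r"\[\[\s*(?:https?|ftp)://[^\]]*\]\]",
    re.IGNORECASE,
)


def find_double_bracket_links(text: str) -> list:
    """Return all external links wrapped in [[...]] instead of [...]."""
    return DOUBLE_BRACKET_EXTERNAL.findall(text)
```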

Broken character entity references

(from en:Wikipedia talk:WikiProject Check Wikipedia)

Any chance you could run a script to find things like [6], [7]. I've been finding a lot of these lately where the semi-colon is missing. Obviously this would be listed as a higher-priority error. — CharlotteWebb 21:07, 6 June 2009 (UTC)

Thought I'd mention it here. -Drilnoth (Talk) 23:27, 6 June 2009 (UTC)
OK, new error 87. -- sk 12:25, 7 June 2009 (UTC)
Excellent; thank you. -Drilnoth (Talk) 15:56, 7 June 2009 (UTC)
Hello. Could you exclude external links from this check? It seems to produce false positives there. --Reaperman (cs) 09:22, 8 June 2009 (UTC)
I will fix this today. -- sk 11:27, 8 June 2009 (UTC)
OK, I have deactivated this error. -- sk 19:45, 8 June 2009 (UTC)

Wrong description

English description for error 86 is wrong.

The script found a link with two brackets to external source like [[http://www.wikipedia.org Wikipedia]]. External links only need one bracket like [[http://www.wikipedia.org Wikipedia]].

The second example should have only one pair of brackets.

And desc for error 85 has extra ". at the end.

The script found a tag without content or a line break like <noinclude></noinclude>. This tag can be deleted.".

I don't like the hack used in the description for error 87: "<tt>&a<code></code>uml;</tt>". --fryed-peach 15:48, 8 June 2009 (UTC)

The description for error 91 contains the word "bedin". I suppose it should be "beginning" instead. --fryed-peach 16:02, 8 June 2009 (UTC)
You can use &amp;uml; to escape the HTML entity instead of a tag hack. --Der Umherirrende 17:49, 8 June 2009 (UTC)
OK, I have incorporated all of this. Thanks for the info. Sorry for my broken English. :-) -- sk 19:50, 8 June 2009 (UTC)

HTML named entities without semicolon

The script finds links. See [8]: all are false positives. Matma Rex (answer me on plwiki) 13:20, 8 June 2009 (UTC)

I have deactivated this section. I will make a better version. -- sk 06:52, 9 June 2009 (UTC)
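The thread does not say what the better version looked like. One possible way to avoid the link-related false positives reported here is to blank out URLs before searching for named entities that lack a semicolon. The Python sketch below is an assumption for illustration; the entity list is a small hand-picked subset and all names are hypothetical.

```python
import re

# Illustrative subset only; a real check would need the full entity list.
NAMED_ENTITIES = ("auml", "ouml", "uuml", "szlig", "nbsp")

URL = re.compile(r"(?:https?|ftp)://\S+", re.IGNORECASE)
BROKEN_ENTITY = re.compile(r"&(?:%s)(?!;)" % "|".join(NAMED_ENTITIES))


def find_broken_entities(text: str) -> list:
    """Return named entities without a trailing semicolon, ignoring URLs."""
    without_urls = URL.sub(" ", text)
    return BROKEN_ENTITY.findall(without_urls)
```

Query strings such as `?x=1&nbsp=2` inside a URL are skipped this way, while `M&uumlnchen` in running text is still reported.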

A slightly odd request

I have another request related to DEFAULTSORTs. The English Wikipedia's guidelines at en:WP:CAT#Using sort keys state: "Don't begin sort keys with lower case letters, unless you want to create a separate sublist (the ordering places lower case letters after all capital letters). To ensure that entries differing by letter case appear together, apply the convention that initial letters of words are capitalized in the sort key, but other letters are lower case. For example, use 'Dubois' in sort keys rather than 'DuBois'."

My bot has been approved to add and modify DEFAULTSORTs to ensure that they are in line with this guideline. To aid in finding articles which need a DEFAULTSORT because of this (or need a current DEFAULTSORT modified because it isn't in line with this), it would be much appreciated if a scan could be made which would check for article titles which:

A) Contain one or more words which start with lowercase letters, but have no DEFAULTSORT, or have a DEFAULTSORT which contains lowercase letters at the start of a word. For example, en:Role-playing game, en:2004 in film, which should have DEFAULTSORTs of "Role-Playing Game" and "2004 In Film".

B) Contain one or more words with capitalization in the middle of the word, but have no DEFAULTSORT, or have a DEFAULTSORT which contains capitalization in the middle of a word. For example, en:Lewis DuBois, en:SSX, which should have DEFAULTSORTs of "Dubois, Lewis" and "Ssx".

These DEFAULTSORTs aid in category organization because capital letters are listed before lowercase letters by default, but this causes some odd sorting issues.

I know that this is a rather odd request and it certainly isn't going to be needed in every language, but it would be very much appreciated for the English Wikipedia. Thanks! -Drilnoth (Talk) 16:50, 6 June 2009 (UTC)

Very interesting. I will build two new errors for you. Maybe tomorrow. -- sk 19:33, 6 June 2009 (UTC)
Thank you! I know that these will be very long lists, but it will be much appreciated whenever you have a chance. (Oh, and the third parameter of the "Lifetime" template also functions as a DEFAULTSORT, if it's not too hard to code that into your script too.) -Drilnoth (Talk) 23:25, 6 June 2009 (UTC)
OK, A and B are a new error. Today I have no time for "Lifetime". -- sk 20:14, 7 June 2009 (UTC)
Okay; thanks! -Drilnoth (Talk) 01:05, 8 June 2009 (UTC)

Would it be possible for this error (and the other DEFAULTSORT-related ones) to exclude any pages with the text "#REDIRECT"? Redirects almost never have categories, so having them listed is kind of pointless (they don't need DEFAULTSORTs). Alternatively, you could list articles as having such errors only if they contain "[[Category:". -Drilnoth (Talk) 21:43, 8 June 2009 (UTC)

In dewiki we have many categories and Persondata in redirect pages, so there we need this. But the alternative is possible. I will try it. -- sk 06:51, 9 June 2009 (UTC)
Okay; sounds good. It just seems pretty pointless to add DEFAULTSORTs to pages which don't have categories. :) -Drilnoth (Talk) 14:08, 9 June 2009 (UTC)
OK, an article must now contain a category to be listed for error 91. -- sk 19:57, 9 June 2009 (UTC)
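The capitalization convention from requests A and B, together with the category condition agreed on at the end of this thread, can be illustrated with a small Python sketch. It is an assumption about one possible implementation, not the script's actual code. It only normalizes letter case and does not reorder names (so "Lewis DuBois" becomes "Lewis Dubois", not "Dubois, Lewis"). All function names are hypothetical.

```python
import re

LETTER_RUN = re.compile(r"[^\W\d_]+")  # consecutive letters form one word
SORTKEY = re.compile(r"\{\{DEFAULTSORT:([^}]*)\}\}")


def normalize_sort_key(key: str) -> str:
    """Capitalize the first letter of each word, lower-case the rest."""
    return LETTER_RUN.sub(
        lambda m: m.group(0)[0].upper() + m.group(0)[1:].lower(), key
    )


def needs_attention(title: str, text: str) -> bool:
    """True if a categorized page has a missing or non-conforming sort key."""
    if "[[Category:" not in text:
        return False  # uncategorized pages (e.g. most redirects) are skipped
    match = SORTKEY.search(text)
    key = match.group(1).strip() if match else title
    return key != normalize_sort_key(key)
```

Applied to the examples from the request, `normalize_sort_key` turns "Role-playing game" into "Role-Playing Game", "2004 in film" into "2004 In Film" and "SSX" into "Ssx".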

Thank you

Thank you for trying to figure out a way to reduce this CPU usage problem. I don't know if it's good or bad that enwiki has so many errors that we can just keep using this same list for days while you fix it. :) -Drilnoth (Talk) 02:21, 15 June 2009 (UTC)

No, it is not a problem of en. It is a problem of the toolserver and of the way my script fetches the article text. At the moment I use the API with this request:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&titles=Paris

This way I can only get one article per request. It would be better to do it like this:

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&titles=Paris%7CDresden%7CBerlin

I will try to build this into the script, but that is a bigger piece of work. -- sk 19:39, 16 June 2009 (UTC)
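As an illustration of the multi-title request shown above, the URL can be assembled from a list of titles as in the following Python sketch. The script itself is written in Perl, so this is only a sketch of the idea; the `revisions_url` helper is a hypothetical name.

```python
from urllib.parse import urlencode

API = "http://en.wikipedia.org/w/api.php"


def revisions_url(titles):
    """Build one query URL that asks for the content of several pages."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": "|".join(titles),  # titles are separated by a pipe
    }
    # urlencode percent-encodes the pipe as %7C, as in the example above.
    return API + "?" + urlencode(params)
```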

Just a thought, though I don't know exactly how the script works and how access rights on the toolserver are defined: would it be possible to request the full text of the set of articles from the toolserver copy in MySQL? -- User:Docu

Hello Docu, I don't use MySQL. I only use the API (see above) and Perl. -- sk 19:40, 16 June 2009 (UTC)
It might be a way to limit the resources being used. It would retrieve the full text of the articles to check all at once, directly from the database. -- User:Docu

Problem generating the HTML output after the change made because of high resource usage

Hello Stefan! Since the workaround for the high CPU load, the text file is still generated (e.g. http://toolserver.org/~sk/checkwiki/dewiki/dewiki_output_for_wikipedia.txt), but the HTML output file (http://toolserver.org/~sk/checkwiki/dewiki/dewiki_output_for_wikipedia.html) is not updated. This seems to be a problem in all languages. --Video2005 19:57, 15 June 2009 (UTC)

Thanks for the hint, I will look into what is causing this right away. -- sk 19:21, 16 June 2009 (UTC)
OK, it should run tomorrow. -- sk 19:26, 16 June 2009 (UTC)
Many thanks! --Video2005 20:16, 16 June 2009 (UTC)

Problems

Did you run a new version of the script? It probably shows bad pairs (article name, error code) in some cases. I've checked some high-priority errors on pl.wiki, and the code indicated in your tables was not there (nor in previous versions of the articles). Malarz pl 19:25, 21 June 2009 (UTC)

You are not alone! I've asked him the same on the German Wikipedia, at Diskussion:WikiProject Check Wikipedia#Pre mit undefiniertem Ende - Scriptfehler? -- Ben Ben 21:59, 21 June 2009 (UTC)
Malarz pl, I've changed your English; maybe I shouldn't do that, as people could think it's impolite. If so, please say so and I won't do it anymore. -- Ben Ben 21:59, 21 June 2009 (UTC)
Shit. I was expecting something like this, but in my tests I found none of these problems. I will fix this tonight. See also this info about the new version of the script. -- sk 05:13, 22 June 2009 (UTC)
I have stopped the cronjob for today. -- sk 05:15, 22 June 2009 (UTC)
OK, I have fixed this problem. There were two problems with the API: 1.) Only 50 articles are allowed per request. 2.) The order of the articles in the response can differ from the order in the request. I hope it works tonight. I have started a new live scan. -- sk 20:55, 22 June 2009 (UTC)
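The two API pitfalls named in this reply (at most 50 titles per request, and no guaranteed ordering of the results) suggest splitting the title list into batches and matching results by title rather than by position. The Python sketch below illustrates that idea. It assumes the response was requested with format=json in the legacy layout, where pages are keyed by page ID and the wikitext sits under the "*" key; the function names are hypothetical, and the actual fix in the Perl script is not shown in the thread.

```python
def chunked(titles, size=50):
    """Yield successive batches of at most `size` titles."""
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def texts_by_title(response):
    """Map page title to wikitext for one decoded query response.

    Looking pages up by title makes the result independent of the order
    in which the API returns them. Missing pages are skipped.
    """
    pages = response.get("query", {}).get("pages", {})
    result = {}
    for page in pages.values():
        revisions = page.get("revisions")
        if revisions:
            result[page["title"]] = revisions[0].get("*", "")
    return result
```

The API may also normalize titles (for example underscores to spaces), so a real implementation would need to map the normalized titles back to the requested ones.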

Pre with undefined end

Info.xml is flagged, although it contains no unclosed pre. The info text was "<prename>Hansjoerg</prename> <surname>Petry</surname> <street>Gerressener". However, this excerpt is inside source tags, so it should not be detected there. Der Umherirrende 22:29, 11 June 2009 (UTC)

OK, done. --sk 07:25, 1 July 2009 (UTC)

Error 61 could be extended and more efficient

Hi, Stefan!

Your program does not recognize self-closing references followed by a punctuation character:

<ref name="foobar"/>?

Also, these 12 index() calls on the whole text

	$pos = index( $text, '</ref>.') if ($pos == -1);	
	$pos = index( $text, '</ref> .') if ($pos == -1);
	$pos = index( $text, '</ref>  .') if ($pos == -1);
	$pos = index( $text, '</ref>   .') if ($pos == -1);
	$pos = index( $text, '</ref>!') if ($pos == -1);
	$pos = index( $text, '</ref> !') if ($pos == -1);
	$pos = index( $text, '</ref>  !') if ($pos == -1);
	$pos = index( $text, '</ref>   !') if ($pos == -1);
	$pos = index( $text, '</ref>?') if ($pos == -1);
	$pos = index( $text, '</ref> ?') if ($pos == -1);
	$pos = index( $text, '</ref>  ?') if ($pos == -1);
	$pos = index( $text, '</ref>   ?') if ($pos == -1);

could be replaced with a single regular expression match like

	$text =~ m|</ref> {0,3}[?!.]|;

or even more

	$text =~ m|</ref> *[?!.,]|;

I guess it is a bit faster (especially when there is no hit) and definitely more scalable.

Cheers

Bitman --193.6.17.70 11:52, 28 June 2009 (UTC)

Thanks for this info. I will try this, but I am not a Perl guru; I am only learning by doing. I will test this. -- sk 07:19, 1 July 2009 (UTC)
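Both points raised in this thread, replacing the repeated index() calls by one pattern and also catching self-closing references, can be combined in a single expression. The snippets above are Perl; the following is a Python sketch of the same idea, with hypothetical names, and is not the script's actual error 61 code.

```python
import re

# Either a closing </ref> or a self-closing <ref ... />, then up to three
# spaces, then one of the punctuation characters from the index() calls.
REF_THEN_PUNCT = re.compile(r"(?:</ref>|<ref\b[^>]*/>) {0,3}[?!.]")


def ref_before_punctuation(text: str) -> int:
    """Return the position of the first hit, or -1 like Perl's index()."""
    match = REF_THEN_PUNCT.search(text)
    return match.start() if match else -1
```

The `\b` after `<ref` keeps `<references/>` from being treated as a self-closing reference.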

misused id or class

Hello, can you add a new feature: detecting articles that contain <span id="foo"> or <span class="foo">, and the same for <div>? Sometimes classes like these are misused. I think there is no need to use them in article text, only perhaps in code or templates. JAn Dudík 06:20, 30 June 2009 (UTC)

OK, I will try this in the next few days. -- sk 07:20, 1 July 2009 (UTC)
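A detection along the lines of this request could be sketched in Python as follows. The thread does not say whether or how the check was implemented, so the pattern and the function name are purely illustrative assumptions.

```python
import re

# A <span> or <div> start tag that carries an id= or class= attribute.
ID_OR_CLASS = re.compile(
    r"<(?:span|div)\b[^>]*?\s(?:id|class)\s*=",
    re.IGNORECASE,
)


def uses_id_or_class(text: str) -> bool:
    """True if the text contains a span or div with an id or class."""
    return ID_OR_CLASS.search(text) is not None
```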

Categories

Can you sometimes run your script on categories too? Some errors are the same. JAn Dudík 06:20, 30 June 2009 (UTC)

My script works with the dumps. It scans all pages, including categories. But for most errors I only check namespace 0 (articles) and namespace 6 (images), nothing more. Which errors should also be checked in namespace 14 (category)? -- sk 07:24, 1 July 2009 (UTC)

Small error and feature request

  • Is it possible to also exclude Menschenlesbarkeit from the ISBN-13 check?
  • And could you append three more rows below the table, giving the totals for each priority? That way one could see how many errors fall into each priority. Example:
nr.    name  script  dewiki  previous scan  last scan  trend  change
Total  --    --      low     1000           800               -200
Total  --    --      middle  2000           2050              50
Total  --    --      high    500            123               -377

--Goforgold 13:06, 25 June 2009 (UTC)

I would like to briefly point out Menschenlesbarkeit again (it looks as if it was forgotten...) and also point to this and this. I suspect the script ran amok somewhere. Regards -- Goforgold 15:10, 6 July 2009 (UTC)

Thanks for the hint. I will still take care of Menschenlesbarkeit. The other things are just a hiccup of the script and will surely be gone after the next run. It must have been something with the API, or network problems. For lack of time I have not been able to work on the script itself in the last few days. -- sk 15:31, 6 July 2009 (UTC)

Error #003 in Turkish Wikipedia

Good evening. Could you please modify the script so that it searches not only for <references /> but also for {{reflist}}? This is needed because the output lists valid pages (with reference tags) as if they had no reference tags. Thanks! --Superyetkin (ileti) 20:07, 31 May 2009 (UTC)

Hello Superyetkin, searching for "reflist" is already a feature of the script. Can you tell me an article where the script doesn't find the reflist template? I think there is another problem. -- sk 15:48, 1 June 2009 (UTC)
Hello there, sorry for the late reply.
The problem is that this template, which is used in featured articles, is not recognized by the script, and these articles come up with "missing <references />" errors on the project page. You can see the current situation here. Could you please examine the issue and resolve it? Thanks for your help. --Superyetkin 19:10, 9 July 2009 (UTC)

OK, I have added "Şablon:Kayan kaynakça". -- sk 19:55, 3 August 2009 (UTC)
Thanks! --Superyetkin 08:03, 4 August 2009 (UTC)
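The fix described here amounts to treating additional, wiki-specific templates as equivalent to <references />. A configurable check of that kind might look like the following Python sketch. The function name and the way templates are passed in are assumptions for illustration, not the script's actual design.

```python
import re


def has_reference_list(text, templates=("reflist",)):
    """True if the page contains <references ...> or one of the templates.

    `templates` holds template names that produce a reference list on a
    given wiki; names are compared case-insensitively.
    """
    if re.search(r"<references\b", text, re.IGNORECASE):
        return True
    lowered = text.lower()
    return any("{{" + name.lower() in lowered for name in templates)
```

For the Turkish Wikipedia, the call would include the extra template, for example `has_reference_list(text, templates=("reflist", "Kayan kaynakça"))`.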

Wrong degree sign

Hello Stefan, couldn't you already produce a list for WP:BA #Nummernzeichen statt Gradzeichen? That would be easier than possibly having to walk through the whole database when one does not use a dump. -- @xqt 06:55, 12 June 2009 (UTC)

I have put it on my to-do list. -- sk 19:59, 3 August 2009 (UTC)
I would rather leave this out, because I don't know whether it is handled the same way in other languages. Apparently it worked out with the bot anyway. -- sk 19:07, 18 August 2009 (UTC)