Benutzer Diskussion:Stefan Kühn/Check Wikipedia/Archiv/2009/Juli


Exclusion list

Hello,

I think a per-error exclusion list (pages not scanned by your script) would help us. Some examples (discussed on the French wiki):

  • for error 37 when we don't want to add DEFAULTSORT on sinogrammes, kanji, ...
  • for error 30 when some files should not have a description (especially image links in templates named {{Infobox ...}}) ...
  • for links to other namespaces (on discussion page, we said that sometimes a change is needed, sometimes not)


And also for false positives such as fr:Travail des enfants and fr:Oiseau (squelette).

We could add these pages (after verifying them) to an exclusion list per error (like Projet:Correction syntaxique/Exclusion list/30 for error 30, ...), and your script would ignore the pages listed there. What do you think about that?

Thanks. Al1 14:46, 1 Jul 2009 (UTC)

Hello Al1, in DE we also had a discussion about a whitelist for every error. Now FR wants a whitelist too. I had the idea for WikiProject Check Wikipedia/Whitelist#36. In the next months I will try to make this possible. At the moment I am very busy (privately and at work). In the meantime, please write the exclusions in the description (like in EN). -- sk 20:36, 1 Jul 2009 (UTC)
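
A minimal sketch of the per-error exclusion list discussed in this thread. It is not taken from the Check Wikipedia script: the `EXCLUSIONS` table, the page titles and the function name are hypothetical and only illustrate the idea of dropping whitelisted pages from the findings of one specific error.

```python
# Hypothetical whitelist: error number -> page titles to skip for that error only.
EXCLUSIONS = {
    30: {"Page A"},
    37: {"Page C"},
}

def filter_findings(findings, exclusions=EXCLUSIONS):
    """Keep only (error_number, title) pairs that are not whitelisted for that error."""
    return [
        (error, title)
        for error, title in findings
        if title not in exclusions.get(error, set())
    ]

# "Page A" is skipped for error 30 but is still reported for error 37.
findings = [(30, "Page A"), (30, "Page B"), (37, "Page A")]
remaining = filter_findings(findings)
```

Keeping one list per error number, as proposed above, means that a page excluded from one check is still scanned by all the others.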

Error 083 - possible bug (WP in Italian)

Hello, this is a short message to inform you that the Check Wikipedia script run on the Italian Wikipedia flagged error 083 (Headlines start with three "=" and later with level two) on the article "it:Episodi di Pocket Monsters Diamond & Pearl", where the first headline actually starts with two "=" but is wrapped in a "noinclude" pair like this: <noinclude>== Title ==</noinclude>. The article was derived by "stripping" part of the contents from a very long original one, and the "noinclude" is useful for correct handling of nested articles. IMHO, if possible and if this does not conflict with other processing in the script, the "noinclude" should be ignored so that the script detects the logically correct sequence of headline levels. Thank you very much and keep up this precious work. -- L736E 16:46, 8 Jul 2009 (UTC)

Hello L736E, I think I need this detection of "noinclude" for other things. But I also think this headline inside a noinclude is a bug in the article. At the moment I have no idea how to fix this. -- sk 19:21, 13 Jul 2009 (UTC)
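
One way the suggestion above could look, as a sketch under assumptions: the regular expressions and the function name are illustrative and not the script's own code. The idea is to drop only the noinclude tags (not their content) before looking for the first headline, so that a headline wrapped in a noinclude pair still counts.

```python
import re

# A headline line: leading "=" signs, some text, trailing "=" signs.
HEADING = re.compile(r"^(=+)[^=].*?=+\s*$", re.MULTILINE)
# Opening and closing noinclude tags; the enclosed text is kept.
NOINCLUDE_TAGS = re.compile(r"</?noinclude\s*/?>", re.IGNORECASE)

def first_heading_level(wikitext, ignore_noinclude=True):
    """Return the level of the first headline, or None if there is none."""
    if ignore_noinclude:
        wikitext = NOINCLUDE_TAGS.sub("", wikitext)
    match = HEADING.search(wikitext)
    return len(match.group(1)) if match else None

text = "<noinclude>== Title ==</noinclude>\n=== Sub ===\n== Next ==\n"
level_ignoring_tags = first_heading_level(text)
level_with_tags = first_heading_level(text, ignore_noinclude=False)
```

With the tags stripped, the first headline is level two; with the tags left in place, the wrapped line is not recognised as a headline and the level-three headline is seen first, which matches the reported false positive.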

Error #033

Hello Stefan. As far as I can see here, there is no wiki syntax routine to replace underlined text (<u>), so what is the use of this error? --Superyetkin 16:08, 13 Jul 2009 (UTC)

See here. The underline tag will not be supported in future HTML. If you really need it in an article, then it should be in a span. This is XHTML-conformant. -- sk 19:37, 13 Jul 2009 (UTC)
I'm sorry to interrupt. But that is bullshit. You have no right to force people to use span tags instead of u tags. MediaWiki should simply keep supporting u tags, end of story. Follow the KISS principle. -- chemiewikibm cwbm 22:31, 13 Jul 2009 (CEST)
You can disable this check on your wiki if you don't want to use it. Just set the priority in the _xx part of the translation text. --Vina 06:51, 14 Jul 2009 (UTC)
Thanks for the clarification, Stefan. I would really appreciate it if you answered my other query about error #003 above. Cheers! --Superyetkin 21:08, 13 Jul 2009 (UTC)

Error #040 in Japanese Wikipedia

Hello, Stefan!

In jawiki there are a lot of HTML font tags, but the script reports none. Font tags were still reported up to the 2009-01-30 version; the problem has occurred since the 2009-02-11 version. Best regards! --Mymelo 01:30, 11 Jul 2009 (UTC)

Thanks for this info. In other languages there are no errors reported either. I will check this in the script. Maybe I can fix this. -- sk 19:35, 13 Jul 2009 (UTC)
Many thanks for your comment. --Mymelo 11:40, 18 Jul 2009 (UTC)
OK, I have changed the script a little bit. Now I search not only for "<font>"; it also searches for "<font ...". Maybe this helps. We will see tomorrow.
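
The change described above (matching both "<font>" and "<font ...") can be sketched with a single pattern. This is an illustration only; the pattern and the function name are assumptions, not the script's actual code.

```python
import re

# Matches "<font>" as well as "<font ...attributes...>", in any letter case.
FONT_TAG = re.compile(r"<font(\s[^>]*)?>", re.IGNORECASE)

def has_font_tag(wikitext):
    """Return True if the text contains an opening font tag."""
    return FONT_TAG.search(wikitext) is not None

with_attributes = has_font_tag('<font color="red">text</font>')
bare_tag = has_font_tag("<FONT>text</FONT>")
other_tag = has_font_tag("<fontsize>")
```

Requiring either ">" or whitespace directly after "font" avoids matching unrelated tag names that merely start with the same letters.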

Many suggestions

Hello Mr. Kühn,

While editing on the French Wikipedia, I found many possible errors. I list an example of each.

  • HTML entity &#x2200; (∀) should be translated into Unicode character \u2200.
  • HTML entity &#2200; (࢘) should be translated into Unicode character \u0898.
  • Many HTML entities, like &Eacute; (É), should be converted to Unicode. However, &nbsp; must be excluded, since it has legitimate uses. A list of such entities is given by the Web Design Group. I have the full list in a JavaScript file and can send it to you (I use it in my Firefox extension, Weekedit). They are also listed on EN.WP: en:List of XML and HTML character entity references.
  • If the title of the article is PSoC, the sort key should be {{DEFAULTSORT:Psoc}}.
  • A category sort key with a diacritic, like [[Catégorie:Acteur français|Depardieu, Gérard]], is bad.
  • The wikilink [[fractale|fractales]] should be shorter: [[fractale]]s.

Regards,

Cantons-de-l'Est

Very interesting ideas. I will try to insert this in my script. -- sk 20:01, 20 Jul 2009 (UTC)
A comment: I don't think the last one is a good idea, because it will increase the number of false positives of a spell-checker. --129.215.104.155 10:36, 21 Jul 2009 (UTC)
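
Two of the suggestions in this thread, entity decoding and piped-link shortening, can be sketched as follows. This uses Python's standard `html.unescape` rather than the JavaScript list mentioned in the post. The `KEEP` set is an assumption: only &nbsp; is named in the thread, and the markup-significant entities are added here as a precaution. The link rule is a simplification that treats any run of letters after the target as a link trail.

```python
import html
import re

# Assumption: besides &nbsp; (named in the post), entities that would change
# the meaning of the markup when decoded are left untouched.
KEEP = {"nbsp", "lt", "gt", "amp"}
ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")
PIPED_LINK = re.compile(r"\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]")

def decode_entities(text):
    """Replace numeric and named entities with the characters they stand for."""
    def repl(match):
        if match.group(1).lower() in KEEP:
            return match.group(0)
        # Unknown names are returned unchanged by html.unescape.
        return html.unescape(match.group(0))
    return ENTITY.sub(repl, text)

def shorten_piped_links(text):
    """Rewrite [[target|targetXYZ]] as [[target]]XYZ when XYZ is only letters."""
    def repl(match):
        target, label = match.group(1), match.group(2)
        suffix = label[len(target):]
        if label.startswith(target) and suffix.isalpha():
            return "[[" + target + "]]" + suffix
        return match.group(0)
    return PIPED_LINK.sub(repl, text)

decoded = decode_entities("&#x2200; &#2200; &Eacute;t&eacute; a&nbsp;b")
shortened = shorten_piped_links("[[fractale|fractales]] and [[Paris|la capitale]]")
```

Whether the shortened link form is desirable is disputed in the comment above, so the second function is best read as a description of the proposal, not a recommendation.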

Error #003 in Japanese Wikipedia

Hello, Stefan.

In jawiki there are error reports for error #003, but 2 of the articles already use a reference template.

ja:国際水泳連盟 has the template {{脚注リスト}}, which is a new redirect to the {{Reflist}} template. Please set up your script for the Japanese localisation. But I could not find out what the problem with ja:八王子市 is. Best regards. --Mymelo 11:11, 18 Jul 2009 (UTC)

I will insert this next weekend. -- sk 20:02, 20 Jul 2009 (UTC)
I will also check ja:八王子市 at the weekend. At the moment I can't find a problem. -- sk 20:05, 20 Jul 2009 (UTC)
Stefan, could you be so kind as to do the same (update your script) for the Turkish wiki as well? Actually, I had mentioned this before (see my posts above), but you do not seem to have noticed them at all. Thanks for your help. --Superyetkin 20:48, 20 Jul 2009 (UTC)
There is only one script. It works in all languages. -- sk 20:12, 3 Aug 2009 (UTC)
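
A sketch of how localized reference-template names could be handled, assuming (as the thread implies but does not state) that error #003 flags pages that use <ref> without any reference list. The `REFERENCE_TEMPLATES` table and both function names are hypothetical; only the template names {{Reflist}} and {{脚注リスト}} come from the discussion.

```python
import re

# Hypothetical per-wiki table of templates that output the reference list.
REFERENCE_TEMPLATES = {
    "jawiki": ["Reflist", "脚注リスト"],
}
DEFAULT_TEMPLATES = ["Reflist"]

def has_reference_list(wikitext, wiki):
    """True if the page has a <references> tag or a known reference template."""
    if re.search(r"<references\b", wikitext, re.IGNORECASE):
        return True
    for name in REFERENCE_TEMPLATES.get(wiki, DEFAULT_TEMPLATES):
        pattern = r"\{\{\s*" + re.escape(name) + r"\s*(\||\}\})"
        if re.search(pattern, wikitext, re.IGNORECASE):
            return True
    return False

def missing_reference_list(wikitext, wiki):
    """True if the page uses <ref> but nothing that would display the notes."""
    return "<ref" in wikitext.lower() and not has_reference_list(wikitext, wiki)

page = "Text<ref>Source</ref>\n{{脚注リスト}}"
flagged_on_jawiki = missing_reference_list(page, "jawiki")
flagged_without_alias = missing_reference_list(page, "dewiki")
```

A single script can still serve all languages this way; only the alias table differs per wiki.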

Non-detected templates for #34

Hello, your script should detect the use of {{#ifexist|a|b}} templates, like here. JAn Dudík 06:50, 23 Jul 2009 (UTC)

OK, I will add this to the script. -- sk 20:08, 3 Aug 2009 (UTC)
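
A small sketch of how constructs such as the one named above could be found. It is an assumption about the approach, not the script's code: it accepts both the "|" form written in the post and the ":" form that MediaWiki parser functions normally use.

```python
import re

# "{{#name" followed by ":" (usual parser-function syntax) or "|" (as in the post).
PARSER_FUNCTION = re.compile(r"\{\{\s*#(\w+)\s*[:|]")

def find_parser_functions(wikitext):
    """Return the lower-cased names of all {{#...}} constructs in the text."""
    return [name.lower() for name in PARSER_FUNCTION.findall(wikitext)]

found = find_parser_functions("{{#ifexist|a|b}} {{#if: x | y}} {{Infobox}}")
```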

Special characters in interwiki

Can you detect special characters in interwiki links, like after this edit? JAn Dudík 09:04, 23 Jul 2009 (UTC)

Good idea. I will put this on my to-do list. -- sk 06:59, 4 Aug 2009 (UTC)

More flexible error list?

Dear Stefan!

At the moment one of your programs fetches the file http://toolserver.org/~sk/checkwiki/huwiki/huwiki_translation.txt and puts the content of the error_XXX_head_script fields into the final error list. If no error is found, the field content is copied unchanged; otherwise it is surrounded with [[...]] markers.

I would like to add additional info to this table column, but the above mechanism does not allow it. However, I have a suggestion. If you find a template called chkwiki here, you should not add square brackets but overwrite the first parameter with the error count. I mean something like this:

error_XXX_head_script={{chkwiki|count=N|errno=XXX|msg0=cell_text_A|msg1=cell_text_B}}

The error count should be put in place of N. At this point we could write arbitrary templates that change the displayed text according to the error count. Sky is the limit. :-)
However, if the translated text does not begin with {{chkwiki|count=N|... your program would apply the current algorithm. This way compatibility with current-style translations is preserved.

What is your opinion? -- Bitman 06:01, 29 Jul 2009 (UTC)

The problem is that every language needs this template. Every change must be made in all languages. At the moment we have more than 30 languages, and in the future there will be more. I think this is not practicable. -- sk 07:57, 4 Aug 2009 (UTC)

Uhmm... I don't understand what you mean. Could you show an example? AFAIK the solution I suggested is totally independent of the number of checked wikis and languages. It is not necessary to write such a template in every wiki. If somebody needs it, they use it. Other wiki maintainers need not care about it. They get the current internationalised error list. -- Bitman 16:45, 4 Aug 2009 (UTC)

Of course national templates are created and maintained by local people. You have nothing to do with them. --Bitman 16:55, 4 Aug 2009 (UTC)
OK, I understand. You mean I should update the script so that every language can use its own template inside the translation. If I understand you correctly, the problem is the [[...]]. But I don't understand what you want to do with this template. I use "error_XXX_head_script" only as the headline and inside the statistics table. Please describe the "Sky" better :-) -- sk 19:50, 4 Aug 2009 (UTC)
Yes, the problem is that there is no way to adapt to the [[...]] placed (or not) by your program. A localized template, however, would apply (or not) square brackets where necessary depending on the error count, while the other elements of the table cell remain fixed.
Actually I want to add a warning icon to items that are bot-correctable, so human editors would not waste their time editing these trivial errors.
Another advantage: at present our editors manually remove the wikilink, leaving the plain text in the table cell, after fixing all errors of a certain kind. It is a bit faster and easier to change the template parameter count from 11 to 0.
The sky? Version 2.0 of the template may also insert a smiley into the cell of solved errors. --Bitman 14:25, 7 Aug 2009 (UTC)
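
The proposal in this thread can be summarised in a short sketch. The function name is hypothetical and the code is not from the script; it only restates the rule described above: a translation starting with {{chkwiki|count=N| gets the error count substituted for N, and every other translation keeps the current behaviour (plain text for zero errors, [[...]] otherwise).

```python
import re

# The translated text opts in by starting with exactly this prefix.
CHKWIKI_PREFIX = re.compile(r"^\{\{chkwiki\|count=N\|")

def render_head(head_script, error_count):
    """Build the table cell text for one error from its translation string."""
    if CHKWIKI_PREFIX.match(head_script):
        # New style: the local template decides how to display the count.
        return head_script.replace("count=N", "count=%d" % error_count, 1)
    if error_count > 0:
        # Current style: link the text when there are errors to fix.
        return "[[" + head_script + "]]"
    return head_script

templated = render_head("{{chkwiki|count=N|errno=033|msg0=A|msg1=B}}", 11)
linked = render_head("Some error text", 3)
plain = render_head("Some error text", 0)
```

Because the check is on the translation string itself, wikis that do not use such a template are unaffected, which is the compatibility argument made above.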

White space detection

Hello, can you add a new error: articles with long lines of text that begin with whitespace? Leading whitespace can be used for scripts or something similar, but I think such script lines should be shorter than e.g. 80 characters. JAn Dudík 06:46, 23 Jul 2009 (UTC)

I also had this idea, but I have no good algorithm to detect this. There are too many problems at the moment, for example source tags or templates. -- sk 06:57, 4 Aug 2009 (UTC)
pywikipediabot uses several exceptions for text inside various tags; maybe you can have a look at it. --Nemo bis 23:37, 11 Aug 2009 (UTC)
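
A rough sketch of the check proposed in this thread, under stated assumptions: the 80-character threshold comes from the request, while the list of skipped tags and the function name are illustrative. It does not reproduce pywikipediabot's exception handling, and templates are not handled at all.

```python
import re

# Blocks whose content is legitimately preformatted and should not be checked.
SKIP_BLOCKS = re.compile(
    r"<(source|pre|nowiki)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)

def long_indented_lines(wikitext, limit=80):
    """Return lines that start with a space and hold more than `limit` characters."""
    visible = SKIP_BLOCKS.sub("", wikitext)
    return [
        line
        for line in visible.splitlines()
        if line.startswith(" ") and len(line.strip()) > limit
    ]

sample = (
    " " + "x" * 100 + "\n"
    " short\n"
    "<source>\n " + "y" * 100 + "\n</source>\n"
)
suspicious = long_indented_lines(sample)
```

Only the first line is reported: the short indented line is under the threshold and the long line inside the source block is skipped.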

Error #082 on Swedish Wikipedia

Links starting with "S:", like [[S:t Lukasstiftelsen]], are not links to another Wikimedia project on the Swedish Wikipedia, because many names in Swedish start with S:t. Best regards! -- Lavallen 19:51, 10 Jul 2009 (UTC)

Ohh, very interesting. I think this is the short link to Wikisource. What do you use as the shortlink to Wikisource on the Swedish Wikipedia? -- sk 19:31, 13 Jul 2009 (UTC)
The Swedish Wikipedia uses "src" as the shortlink to Wikisource. Elfsborgarn 11:33, 16 Jul 2009 (UTC)
I see you deactivated this in svwiki. It would be a lot of work to include this in the script. -- sk 19:10, 18 Aug 2009 (UTC)
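
For illustration only, a per-wiki prefix table is one way the Swedish case could be handled without disabling the check. Everything here is hypothetical (the table, the default set and the function name); only the "S:" and "src" prefixes are taken from the thread.

```python
# Hypothetical configuration: which link prefixes mean "Wikisource" on which wiki.
DEFAULT_WIKISOURCE_PREFIXES = {"s", "wikisource"}
WIKISOURCE_PREFIXES = {
    "svwiki": {"src", "wikisource"},  # "S:t ..." is an ordinary article title there
}

def is_wikisource_link(target, wiki):
    """True if the link target starts with a Wikisource prefix valid on this wiki."""
    if ":" not in target:
        return False
    prefix = target.split(":", 1)[0].strip().lower()
    return prefix in WIKISOURCE_PREFIXES.get(wiki, DEFAULT_WIKISOURCE_PREFIXES)

swedish_name = is_wikisource_link("S:t Lukasstiftelsen", "svwiki")
swedish_shortlink = is_wikisource_link("src:Some text", "svwiki")
default_shortlink = is_wikisource_link("S:Some text", "dewiki")
```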

False positive #81

Dear Stefan! The article hu:Stadler FLIRT contains some extremely large references with embedded tables. The tables are different in ref#17-ref#19, but your script reports them as identical. -- Bitman 193.6.17.154 16:58, 16 Jul 2009 (UTC)

I delete the tables in my script, and so the references are identical. I have never seen a reference like this. Why do you need this? I think a reference should only contain a link to a source. -- sk 18:21, 16 Jul 2009 (UTC)

I can't answer, I'm not an editor of the article. I am writing a modular bot to repair errors discovered by you. Repairing #81 is quite complicated but not impossible: hu:User:GumiBot/code81. I think exact detection of identical refs may be less hard. :-) -- Bitman 193.6.17.197 05:08, 17 Jul 2009 (UTC)

Bug from Danish Wikipedia

After editing some refs from #81 in a couple of articles, new ref bugs from the same articles (da:Jehovas Vidner and da:Dansk køkken) appeared, but they weren't added to the articles after I edited them. So the conclusion must be that #81 doesn't catch more than one bug per article at a time :) --Anigif 21:18, 12 Aug 2009 (UTC)

Yes, this is right. I report only the first duplicate ref, because sometimes there are many of them in one article. -- sk 19:29, 18 Aug 2009 (UTC)
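
The behaviour discussed in the last two threads (exact comparison of reference bodies, reporting only the first duplicate) can be sketched like this. It is an assumption-laden illustration, not the script's code: reference bodies are compared in full, including any embedded table, after whitespace is normalised, and self-closing named refs are ignored.

```python
import re

# Opening <ref ...> (not self-closing), body, closing </ref>.
REF = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref\s*>", re.IGNORECASE | re.DOTALL)

def first_duplicate_ref(wikitext):
    """Return the body of the first reference that repeats an earlier one, else None."""
    seen = set()
    for match in REF.finditer(wikitext):
        body = " ".join(match.group(1).split())  # collapse runs of whitespace
        if body in seen:
            return body
        seen.add(body)
    return None

different_tables = "<ref>A {|\n|1\n|}</ref> <ref>A {|\n|2\n|}</ref>"
repeated = "<ref>X</ref> <ref>Y</ref> <ref>X</ref> <ref>Y</ref>"
no_duplicate = first_duplicate_ref(different_tables)
first_hit = first_duplicate_ref(repeated)
```

Returning at the first hit mirrors the "only the first duplicate" behaviour, which is why fixing one pair can reveal another on the next scan.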

Error #69

Hi, there may be a false positive in it:Codice ISBN as one image name contains "ISBN-13". That's my guess, could you check it as well? Marcol-it

I can add this article as an excluded article. I already do this with the article ISBN in de. -- sk 20:06, 20 Jul 2009 (UTC)
Thanks, that will be good! :) Marcol-it 18:08, 21 Jul 2009 (CEST)
OK, I have fixed this. -- sk 20:11, 3 Aug 2009 (UTC)

We get the same false positive in ca:Lector de codi de barres, and we will get it in ca:ISBN if it gets inspected. Can you please whitelist them? --JoRobot 21:45, 1 Sep 2009 (UTC)

Other namespaces

Hi Stefan. I think that the script only searches for errors in namespace 0 (principal). It would be nice if, on eswiki at least, it also searched namespace 104 (Anexo:), which is used for lists and can have the same errors to fix. Can your script scan namespaces 0 and 104 in future runs? Thanks in advance! Muro de Aguas 17:07, 9 Jul 2009 (UTC)

Hello Muro de Aguas, thanks for this info. I had never heard about this namespace 104. This is very interesting. I will try to include this soon. -- sk 19:24, 13 Jul 2009 (UTC)
I will put this on my to-do list. -- sk 20:00, 3 Aug 2009 (UTC)
OK, I have included this. -- sk 19:06, 18 Aug 2009 (UTC)
This section can be archived. sk 21:00, 4 Oct 2009 (CEST)

Suggestions from France: Image without description

This detection returns a lot of false positives. The problem was raised on the French Project:CheckWiki discussion page, and we agreed to suggest a change. When an image is used as a plain image or in an infobox (or another template), the image description acts as the alternate description. The alternate-description problem is complicated and different from the need for a description on a "thumb" image. At the moment, "images that really need a description" and "images that do not need a description" are mixed together (I know an alternate description is needed, but that is another problem). We would like to modify error 30; two ways have been suggested:

  1. Detect only where a description is really needed: thumb and gallery (gallery is already detected), so only when "thumb" is added to the image. Plain images and images in infoboxes (and templates) can be left out.
  2. Split the error into two: the same detection as above (only for thumb), plus a separate detection for plain images and images in infoboxes (and templates).

The first one could be the best (at least for France), since the alternate-description problem is not going to be resolved soon. This fix could make the image problem easier to handle. Do you agree with these changes?

Maybe ignore pictures smaller than 50 px for this? JAn Dudík 06:48, 23 Jul 2009 (UTC)
The script is stupid. It only detects images in the text. Every image should have a description, even the very small ones. Yes, this is a lot of work. The only way is to split this error: one for thumbs only, and one for the rest. -- sk 07:09, 4 Aug 2009 (UTC)
Splitting it into two errors is fine. This would at least fix the images that really lack a description... "One with only thumbs and the rest", as you say... --Archimëa 17:07, 4 Aug 2009 (CEST)
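
The split agreed on above ("one with only thumbs and the rest") could be sketched as follows. This is a simplified illustration, not the script's code: the option list is partial, English keywords are assumed (localized ones such as French equivalents are not covered), and images whose caption contains a nested link are not examined, which happens to be harmless because they have a caption.

```python
import re

IMAGE_LINK = re.compile(r"\[\[(?:File|Image|Fichier):([^\[\]]*)\]\]", re.IGNORECASE)
# Parameters that are display options, not a description (partial, English-only list).
OPTION = re.compile(
    r"^(thumb|thumbnail|frame|frameless|border|left|right|center|none"
    r"|upright(=.*)?|\d*x?\d+px|(alt|link)=.*)$",
    re.IGNORECASE,
)

def classify_images(wikitext):
    """Return (thumbs_without_description, other_images_without_description)."""
    thumbs, others = [], []
    for match in IMAGE_LINK.finditer(wikitext):
        parts = [part.strip() for part in match.group(1).split("|")]
        name, params = parts[0], parts[1:]
        if any(p and not OPTION.match(p) for p in params):
            continue  # some parameter is free text, so a description exists
        is_thumb = any(p.lower() in ("thumb", "thumbnail") for p in params)
        (thumbs if is_thumb else others).append(name)
    return thumbs, others

text = "[[File:A.jpg|thumb]] [[File:B.jpg|50px]] [[File:C.jpg|thumb|A caption]]"
thumbs, others = classify_images(text)
```

Reporting the two lists as separate errors lets a wiki prioritise thumbs while leaving the alternate-description question for later.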

Suggestions from France: Error 063

<sub><small>testo</small></sub> is detected, but <small><sub>testo</sub></small> isn't. Is this normal behaviour? --Archimëa 16:31, 23 Jul 2009 (CEST)

Am I wrong? --Archimëa 17:07, 4 Aug 2009 (CEST)
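
The question above was left unanswered, so whether both nesting orders are meant to be reported is unknown. If they are, a symmetric pattern is one possible approach; the sketch below is an assumption and only covers tags that directly follow one another.

```python
import re

# Either order of directly nested tags: <sub><small> or <small><sub>.
NESTED_SMALL_SUB = re.compile(r"<sub>\s*<small>|<small>\s*<sub>", re.IGNORECASE)

def has_small_sub_nesting(wikitext):
    """True if small and sub are directly nested, in either order."""
    return NESTED_SMALL_SUB.search(wikitext) is not None

sub_outside = has_small_sub_nesting("<sub><small>testo</small></sub>")
small_outside = has_small_sub_nesting("<small><sub>testo</sub></small>")
```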

Suggestions from France: Output limit

We'd like to increase the limit on errors output by the script from 50 to 100. The old version of the program returned 50 errors each day, while the new version only returns 50 every two days. As a consequence, the total number of errors rises, because errors are created faster than they are corrected. Could you please increase the maximum to 100 in order to restore the previous rate?

--Archimëa 17:23, 21 Jul 2009 (CEST)

At the moment the biggest list is in enwiki with 272 KB. I have only one limit for all languages. Maybe I can change this for frwiki. I will try this. --sk 07:16, 4 Aug 2009 (UTC)
OK, thanks... I wasn't sure you would agree; I thought this could increase the scan time and then stress the server...
OK, I have changed the limit from 50 to 100, only for frwiki. -- sk 20:26, 18 Aug 2009 (UTC)
This section can be archived. sk 21:39, 26 Oct 2011 (CEST)