I have the notion I might shadow this project with a parallel thread in Chat, so that if subjects deemed "current" politics (whatever that means) come up, people can freely adduce them. With courtesy and civility, I trust.
Toward that end I have revamped the Wyoming and Vermont spreadsheets, to bring them into greater harmony with the fact that people do vote third party/independent/write-in "other" here in the USA, despite the disincentives that make doing so pretty quixotic. In a system that attempts to guarantee close proportionality for all, not just the largest parties (and to me the logic of going proportional at all suggests no excuse for not doing so), presumably a lot more people would break loose of the duopoly mold. Observe, by contrast, that so far there is no reason to think that even having about 25 districts for every current one would really open the door for many third party candidates! There certainly are independent candidates who could handily carry any of your small districts--but their voter base would be cut back to 4 percent too, after all. I would limit my comments here to occasional notices of updates there--and to stuff that is strictly and directly relevant to what you are doing here.
Vermont immediately raises such issues! I think my project goes far to vindicate what you are doing here, actually. But it is entirely possible to get hard data on how people voted in 2016 for Representative, without having to infer it from PVIs that attempt to average out several diverse elections before 2016, some of which are statewide or even nationwide.
Since the 1920s, the US House of Representatives Clerk's Office has been tabulating results of all Federal elections, including names and party affiliations of most minor candidates--generally speaking, the exception is write-ins, which are not reported by name. They collect official reports from each state, as well as the Territories, and these reports can be accessed here. They are published in PDF form, in a format that unfortunately cannot be pasted into spreadsheets, or they'd be a heck of a lot more convenient! Other Federal databases provide even more extensive information, in spreadsheets that are pretty massive and not organized in a terribly convenient form either. At the pace this project is going to go, I think we can just write in the data state by state.
So referring to that, first of all, a significant number of Wyoming voters voted for either the Constitution or Libertarian party candidate in the race for Wyoming's single representative--in numbers that make a visible dent in the Republican majority but would not alter your outcomes. I am developing a method of reconciling your two-party, PVI-based estimates of the small district balances with the overall outcomes recorded for the larger OTL districts, giving a semi-coherent way of discounting the duopoly votes and making guided, speculative estimates of third party votes. In Wyoming, I assumed that the likelihood of a Constitution or Libertarian voter turning up in a particular district would be connected to the strength of Republican dominance there, on the theory that the more rightward an electorate leans, the more comfortable larger numbers would be voting for these alternatives, in part because they would not fear that the spoiler effect would flip the district.
Against that, in a region where conservatives are greatly outnumbered, they might figure the Republican is doomed anyway and that they have nothing to lose by registering their more nuanced version of rightism. A more reasonable approach, then, might be simply to note which districts are most competitive and polarize those toward the duopoly race, as both sides think victory might be within their grasp with enough unity.
However, in my project I propose to have districts as base units exactly conforming to yours in outcome, so the net outcomes can be compared. That means duopoly logic as we know it historically is in full play, and given the strong incentives to choose one of the two dominant parties, people who vote third party anyway presumably are motivated by pretty deep reservations about both mainstream parties.
So, for each district I multiply the Republican share of the total vote, inferred from your PVI, by a random number, then apply corrections to bring all parties in line with the totals in the House Clerk's report.
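For anyone who wants to see that step spelled out, here is a minimal sketch of how it might look outside a spreadsheet. The function name, the uniform 0-to-1 random factor, and the sample shares are all my illustrative assumptions, and the "correction" here is just a simple rescaling so the district figures add up to the reported statewide total.

```python
import random

def allocate_third_party(r_shares, statewide_total, seed=0):
    """Spread a statewide third party total across districts.

    r_shares: Republican share of the vote in each district (0..1),
              as inferred from the PVI (hypothetical inputs).
    statewide_total: the party's reported statewide vote total.
    """
    rng = random.Random(seed)  # fixed seed so the sheet is reproducible
    # Weight each district by Republican strength times a random factor.
    weights = [share * rng.random() for share in r_shares]
    total_weight = sum(weights)
    if total_weight == 0:
        # Degenerate case: fall back to an even split.
        n = len(r_shares)
        return [statewide_total / n] * n
    # Rescale so the districts sum to the reported total.
    return [statewide_total * w / total_weight for w in weights]

# Hypothetical example: three districts, 1,000 third party votes statewide.
votes = allocate_third_party([0.70, 0.60, 0.50], 1000)
```

The rescaling is the simplest correction I can think of; anything fancier (rounding to whole votes, per-district caps) would be layered on top.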
When we get to districts where identified individual independents have run and got significant votes, I'll need to do something more focused. My plan now is to identify which of your districts the real world indie would be running in, and give them there either half the votes they got OTL or a bit under whatever would tip the balance away from your PVI duopoly outcome. Another quarter of their real world votes, plus any overflow from the home district, goes to the other small districts of yours that I consolidate with it into an STV grouping. The final quarter, plus any overflow from the STV district that can't be allowed because it would tip the balance differently than you find, goes to all the other districts of yours within the OTL big district. This amounts to assuming that an independent or regional party able to mount a campaign that shows up with significant votes would run a multi-district campaign with many candidates allied to the historic independent, centered on and strongly supported in the region near the independent's home district, with supporters scattered across the OTL CD as well.
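As a rough sketch of that half/quarter/quarter split with overflow, something like the following would do. The function name, the two cap parameters, and the sample numbers are hypothetical; the caps stand in for "a bit under whatever would tip the balance," which in practice I would read off each district by hand.

```python
def split_independent_votes(otl_votes, home_cap, stv_cap=None):
    """Split an independent's OTL vote total into three pools.

    otl_votes: votes the independent actually received in the OTL district.
    home_cap:  most the home small district can absorb without flipping
               the PVI duopoly outcome (hypothetical, set by hand).
    stv_cap:   same limit for the rest of the STV grouping, or None
               if no limit is needed there.
    Returns (home, stv_neighbors, rest_of_otl_district).
    """
    # Home district: half the OTL votes, unless that would tip the balance.
    home = min(otl_votes / 2, home_cap)
    home_overflow = otl_votes / 2 - home

    # STV grouping: a quarter, plus whatever the home district couldn't take.
    stv_pool = otl_votes / 4 + home_overflow
    stv = stv_pool if stv_cap is None else min(stv_pool, stv_cap)
    stv_overflow = stv_pool - stv

    # Everything else lands in the remaining districts of the OTL CD.
    rest = otl_votes / 4 + stv_overflow
    return home, stv, rest

# Hypothetical example: 40,000 OTL votes, home district capped at 15,000.
home, stv, rest = split_independent_votes(40000, 15000)
```

No votes are created or destroyed: the three pools always sum back to the OTL total.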
I've also decided that after distributing the inscrutable "Other/Write-in" votes randomly across the districts, I will assume all of them, in a ranked choice STV system, would vote for the duopoly candidate who tends to come in second in their district. It might be more reasonable to just nullify them, on the theory that each of them is essentially random. But in studying the nature of larger independent campaigns, I do think I can show a tendency for their supporters to be contrarian to whoever dominates their district. Bearing in mind that I am not doing this in a FPTP context, I think my approach on this can be seen as reasonable.
So when we move on to Vermont and look at the OTL actual returns for the Representative race there, right away we run up against a wall: no Republican candidate ran for that office in that state, so none received any votes whatsoever! We will be facing this kind of thing now and again as we go. There are many districts where a candidate ran unopposed. In Washington and California the electoral procedure is based on a "top two" open primary, and in those states it is not uncommon for the top two primary candidates to be rivals in the same party, so that a Republican voter in some districts faces endorsing one of two Democrats in the general election, or (more rarely in those two states) the other way round. A Louisiana voter faces the opposite situation, in a different form of jungle primary: the general election is the jungle primary, and then if no candidate got over 50 percent, a runoff is held between the top two--in LA, invariably between two Republicans. We need some kind of method to deal with this!
Your PVI approach sails right past this, asking voters only the narrow question of which of the two ways they lean, based on how they did vote in elections most of which had D vs R races.
I had to reconcile an election where it is plausible that, given 21 districts to vote in, a quarter or more of VT's voters will vote Republican, with data from a statewide race with no Republican to vote for. My first lifeline was to note that just shy of 30,000 votes were "blank." Generally I ignore blanks and "spoiled/scattered" ballots, and just look at the ballots that were actually counted for someone or other. But in this case, I think we can infer a great many of those 30K ballots were in fact cast by Republican supporters refusing either to vote for a Democrat or to settle for a protest vote for a third candidate. In VT, the third party in the House race was the state's local Liberty Union party, which I assume veers left in the same sense Constitution and Libertarian veer right. A conservative voter would not want to support LU either, so their choices boil down to writing in someone or leaving it blank.
30K is a lot more than zero, but hardly likely to cover all the people who might vote Republican given the chance. We might assume higher turnout than OTL, but while I think that would be valid in a project focused on proportional representation alone, it is speculative, and it is unclear how far we would go or how many voters opposed to the conservatives might also be encouraged to vote. I decided that a substantial number of people who did in fact vote for the Democratic candidate might have wanted to vote Republican instead, and did not want to support a quixotic run by a radical right wing third party--perhaps they are moderate Republicans out of step with the national party, or people who would vote Democratic in a more moderate state but do so queasily in a state leaning as far left as Vermont nowadays?
I wound up therefore straight up stealing 65,000 votes from the Democratic rolls and transferring them directly to the Republicans statewide, to be distributed per your PVI formula. The Democrats could take the hit!
In fact, after fluffing up the R's from zero to 90,000 in this way, trimming down the D's to somewhat (not much) less than OTL numbers, and throwing in a third party I indexed to Democratic strength, it turned out the D's would sweep not 18 of 21 but all 21 seats in the state. I then had to go into the three districts you had Republicans winning and hand-adjust them to transfer votes out of the D column and into the R column (also taking advantage of the opportunity to transfer some to either Liberty Union or "other/write-in").
In doing this I noticed that, by your PVI index alone, you had only one clearly Republican district, #7, with a mere 7 percent advantage. That was swamped out too. I made sure to make it a slim R plurality win. But your other two VT Republicans both come from districts where your PVI is zero, a dead heat between the two parties. There is no random element to my R and D numbers, so I initially gave them a 0.01 R bias, which would be enough to guarantee victory for the R, but again they were of course swamped out.
It is not unreasonable of you to have fudged both those seats for the R's; in real life, the Republicans, observing hopeless odds in the other 18 seats, would presumably focus all campaigning effort on these three alone. So could the Democrats of course, but even odds show a district out of step with the larger state majority, which could make the D's complacent and the R's extra-determined to win where they can, tipping the balance perhaps.
I'm just noting, it was necessary for me to hand-nudge all three R districts, and you had to do that with 2 of 3 as well, to get any representation for the Republicans.
And yet, when the dust settles, my rather Rube Goldberg approach to a fictitious Republican bloc of votes turns out to line up pretty exactly with your statewide PVI estimate of 6 Republicans for the state as a whole in a proportional system! In fact with PR, that is the number out of 21 the Republicans would win. (Liberty Union would also win one or two seats.)
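For reference, here is a minimal sketch of one common way to turn statewide vote totals into a proportional seat count. I am assuming a Hare-quota largest-remainder rule purely for illustration, which is not necessarily the rule behind the figures above, and the party labels and vote totals below are made up rather than Vermont's actual numbers.

```python
def largest_remainder(votes, seats):
    """Apportion seats by Hare quota with largest remainders.

    votes: dict of party -> vote total (hypothetical inputs).
    seats: number of seats to fill.
    """
    total = sum(votes.values())
    # Each party's exact entitlement in seats (votes divided by the quota).
    quotas = {party: v * seats / total for party, v in votes.items()}
    # Everyone gets the whole-number part first.
    alloc = {party: int(q) for party, q in quotas.items()}
    leftover = seats - sum(alloc.values())
    # Remaining seats go to the largest fractional remainders.
    by_remainder = sorted(quotas, key=lambda p: quotas[p] - alloc[p], reverse=True)
    for party in by_remainder[:leftover]:
        alloc[party] += 1
    return alloc

# Hypothetical example: three parties, 21 seats.
result = largest_remainder({"A": 550, "B": 300, "C": 150}, 21)
# -> {"A": 12, "B": 6, "C": 3}
```

Other rules (D'Hondt, Sainte-Lague) can shift a seat or so at the margins, which is one reason a small party's count might come out as "one or two."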
No third party so far would win any seats, either with your numerous FPTP districts or with my larger STV districts, each formed from an average of 5 of yours. Either way, third parties tend to wind up just enhancing the numbers of whatever mainstream duopoly party they lean toward. Only going to direct and unrestricted PR opens the door to any--so far, one Constitution Party rep from Wyoming and two Liberty Union from Vermont. These would be elevated to office by increasing the total delegation of the two states contested so far beyond 40, by the way. As my idea of how such kicking up should work relates to integration on the national, not state, scale, technically I can't have more than an estimate of how many there would actually be, and they'd be accompanied by either more Republicans or more Democrats, and possibly a few extra for both.
I am going to try to catch up to the states you have already posted to this point and then launch the proposed Chat thread.